IMPORTANT NOTICE


Texas Instruments Incorporated (TI) reserves the right to make changes to its products or to
discontinue any semiconductor product or service without notice, and advises its customers to 
obtain the latest version of relevant information to verify, before placing orders, that the 
information being relied on is current. 


TI warrants performance of its semiconductor products and related software to current
specifications in accordance with TI’s standard warranty. Testing and other quality control
techniques are utilized to the extent TI deems necessary to support this warranty. Specific testing
of all parameters of each device is not necessarily performed, except those mandated by 
government requirements. 


Please be aware that TI products are not intended for use in life-support appliances, devices, 
or systems. Use of TI product in such applications requires the written approval of the 
appropriate TI officer. Certain applications using semiconductor devices may involve potential 
risks of personal injury, property damage, or loss of life. In order to minimize these risks, 
adequate design and operating safeguards should be provided by the customer to minimize 
inherent or procedural hazards. Inclusion of TI products in such applications is understood to be 
fully at the risk of the customer using TI devices or systems. 


TI assumes no liability for applications assistance, customer product design, software
performance, or infringement of patents or services described herein. Nor does TI warrant or
represent that any license, either express or implied, is granted under any patent right, copyright,
mask work right, or other intellectual property right of TI covering or relating to any combination,
machine, or process in which such semiconductor products or services might be or are used. 
About This Manual


Preface 


Read This First 


This user’s guide serves as a reference book for the TMS320C40 and 
TMS320C44 digital signal processors. Throughout the book, all references to 
the TMS320C4x apply to both devices, except when otherwise noted. 


How to Use This Manual 


The following table summarizes the information contained in this user’s guide: 


If you are looking for
information about:        Turn to these chapters:

Addressing modes          Chapter 6, Addressing Modes
ARAUs                     Chapter 2, Architectural Overview
Bootloader                Chapter 10, The Bootloader
Bus Structure             Chapter 2, Architectural Overview
                          Chapter 9, External Bus Operation
Cache                     Chapter 4, Memory and the Instruction Cache
Communication Ports       Chapter 12, Communication Ports
CPU Architecture          Chapter 2, Architectural Overview
                          Chapter 3, CPU Registers
DMA                       Chapter 11, The DMA Coprocessor
Data Formats              Chapter 5, Data Formats and Floating-Point Operation
Delayed Branches          Chapter 7, Program Flow Control
Instruction set           Chapter 14, Assembly Language Instructions
Interrupts                Chapter 7, Program Flow Control
Memory                    Chapter 2, Architectural Overview
                          Chapter 4, Memory and the Instruction Cache
Peripherals               Chapter 12, Communication Ports
                          Chapter 11, The DMA Coprocessor
                          Chapter 13, Timers
Overview of the ’C4x      Chapter 1, Introduction
Program control           Chapter 7, Program Flow Control
Pipeline                  Chapter 8, Pipeline Operation
Registers                 Chapter 3, CPU Registers
                          Chapter 12, Communication Ports
                          Chapter 11, The DMA Coprocessor
                          Chapter 13, Timers
Repeat Mode               Chapter 7, Program Flow Control
Reset                     Chapter 7, Program Flow Control
Timers                    Chapter 13, Timers
Traps                     Chapter 7, Program Flow Control


Style and Symbol Conventions 
This document uses the following conventions: 


Program listings, program examples, file names, and symbol names are
shown in a special font. Examples use a bold version of the special font
for emphasis. Here is a sample program listing segment:


*
LOOP1   RPTB    MAX
        CMPF    *AR0,R0         ;Compare number to the maximum
MAX     LDFLT   *AR0,R0         ;If greater, this is a new max
        B       NEXT
LOOP2   RPTB    MIN
        CMPF    *AR0++(1),R0    ;Compare number to the minimum
MIN     LDFLT   *-AR0(1),R0     ;If smaller, this is new minimum
NEXT




In syntax descriptions, the instruction is in bold face and the parameters 
are in italic face. Portions of a syntax that are in bold face should be 
entered as shown; portions of a syntax that are in italic face describe the 
type of information that should be entered. Here is an example of an 
instruction: 


CMPF3 src2,src1 


Notice that although the instruction mnemonic (CMPF3 in this example) is 
in capital letters, the 'C4x assembler is not case sensitive — it can 
assemble mnemonics entered in either upper or lower case. 


CMPF3 is the instruction mnemonic. This instruction has two parameters,
indicated by src2 and src1.


Square brackets ( [ and ] ) identify an optional parameter. If you use an 
optional parameter, you must specify the information within the brackets; 
however, you don’t enter the brackets themselves. Here’s an example of 
an instruction that has an optional parameter: 


[label] LDP src [,DP]


The LDP instruction is shown with two parameters; one is optional. The 
first parameter, src, is required. The second parameter, DP, and the label, 
are optional. As this syntax shows, if you use the optional second 
parameter, you must precede it with a comma. 


Throughout this book MSB indicates the most significant bit and LSB 
indicates the least significant bit. MS indicates the most significant byte 
and LS indicates the least significant byte. 
Information About Cautions and Warnings


This book may contain cautions and warnings. 


This is an example of a caution statement. 


A caution statement describes a situation that could potentially 
damage your software or equipment. 


This is an example of a warning statement. 


A warning statement describes a situation that could potentially 
cause harm to you. 


The information in a caution or a warning is provided for your protection. 
Please read each caution and warning carefully. 
Related Documentation From Texas Instruments


The following books describe the TMS320 floating-point devices and related 
support tools. To obtain a copy of any of these TI documents, call the Texas 
Instruments Literature Response Center at (800) 477-8924. When ordering, 
please identify the book by its title and literature number. 


TMS320C4x General-Purpose Applications User’s Guide (literature 
number SPRU159) describes software and hardware applications for 
the ’C4x processor. Also includes development support information, 
parts lists, and XDS510 emulator design considerations. 


TMS320C4x Parallel Processing Development System Technical
Reference (literature number SPRU075) describes the TMS320C4x
parallel processing system, a system with four ’C4xs with shared and
distributed memory.


Parallel Processing with the TMS320C4x (literature number SPRA031) 
describes parallel processing and how the ’C4x can be used in parallel 
processing. Also provides sample parallel processing applications. 


TMS320 Floating-Point DSP Assembly Language Tools User’s Guide 
(literature number SPRU035) describes the assembly language tools
(assembler, linker, and other tools used to develop assembly language 
code), assembler directives, macros, common object file format, and 
symbolic debugging directives for the ’C3x and ’C4x generations of 
devices. 


TMS320 Floating-Point DSP Optimizing C Compiler User’s Guide 
(literature number SPRU034) describes the TMS320 floating-point C 
compiler. This C compiler accepts ANSI standard C source code and 
produces TMS320 assembly language source code for the ’C3x and 
’C4x generations of devices. 


TMS320C4x C Source Debugger User’s Guide (literature number 
SPRU054) tells you how to invoke the ’C4x emulator and simulator
versions of the C source debugger interface. This book discusses 
various aspects of the debugger interface, including window 
management, command entry, code execution, data management, and 
breakpoints. It also includes a tutorial that introduces basic debugger 
functionality. 


TMS320C4x Technical Brief (literature number SPRU076) gives a
condensed overview of the ’C4x DSP and its development tools. It also
lists TMS320C4x third parties.


TMS320 Family Development Support Reference Guide (literature number 
SPRU011) describes the ’320 family of digital signal processors and the
various products that support it. This includes code-generation tools 
(compilers, assemblers, linkers, etc.) and system integration and debug 
tools (simulators, emulators, evaluation modules, etc.). This book also 
lists related documentation, outlines seminars and the university 
program, and gives factory repair and exchange information. 


TMS320 Third-Party Support Reference Guide (literature number 
SPRU052) alphabetically lists over 100 third parties that supply various 
products that serve the family of 320 digital signal processors—software 
and hardware development tools, speech recognition, image 
processing, noise cancellation, modems, etc. 


TMS320 DSP Designer’s Notebook: Volume 1 (SPRT125). Presents 
solutions to common design problems using ’C2x, ’C3x, ’C4x, ’C5x, and 
other TI DSPs.


Related Articles and Books 


viii 


A wide variety of related documentation is available on digital signal 
processing. These references fall into one of the following application 
categories: 


General-Purpose DSP 
Graphics/Imagery 
Speech/Voice 

Control 

Multimedia 

Military 
Telecommunications 
Automotive 
Consumer 

Medical 

Development Support 


DOUUOUUUUUOUU 


In the following list, references appear in alphabetical order according to 
author. The documents contain beneficial information regarding designs, 
operations, and applications for signal-processing systems; all of the 
documents provide additional references. Texas Instruments strongly 
suggests that you refer to these publications. 


General-Purpose DSP: 


1) Antoniou, A., Digital Filters: Analysis and Design, New York, NY: 


McGraw-Hill Company, Inc., 1979. 


Related Articles and Books 


2) Brigham, E.O., The Fast Fourier Transform, Englewood Cliffs, Nu: 
Prentice-Hall, Inc., 1974. 


3) Burrus, C.S., and T.W. Parks, DFT/FFT and Convolution Algorithms, New 
York, NY: John Wiley and Sons, Inc., 1984. 


4) Chassaing, R., Horning, D.W., “Digital Signal Processing with Fixed and 
Floating-Point Processors. ” COED, USA, Volume 1, Number 1, pages 1—4, 
March 1991. 


5) Defatta, David J., Joseph G. Lucas, and William S. Hodgkiss, Digital 
Signal Processing: A System Design Approach, New York: John Wiley, 
1988. 


6) Erskine, C., and S. Magar, “Architecture and Applications of a 
Second-Generation Digital Signal Processor.” Proceedings of IEEE 
International Conference on Acoustics, Speech, and Signal Processing, 
USA, 1985. 


7) Essig, D., C. Erskine, E. Caudel, and S. Magar, “A Second-Generation 
Digital Signal Processor.” /EEE Journal of Solid-State Circuits, USA, 
Volume SC-—21, Number 1, pages 86-91, February 1986. 


8) Frantz, G., K. Lin, J. Reimer, and J. Bradley, “The Texas Instruments 
TMS320C25 Digital Signal Microcomputer.” /EEE Microelectronics, USA, 
Volume 6, Number 6, pages 10-28, December 1986. 


9) Gass, W., R. Tarrant, T. Richard, B. Pawate, M. Gammel, P. Rajasekaran, 
R. Wiggins, and C. Covington, “Multiple Digital Signal Processor 
Environment for Intelligent Signal Processing.” Proceedings of the IEEE, 
USA, Volume 75, Number 9, pages 1246-1259, September 1987. 


10) Gold, Bernard, and C.M. Rader, Digital Processing of Signals, New York, 
NY: McGraw-Hill Company, Inc., 1969. 


11) Hamming, R.W., Digital Filters, Englewood Cliffs, NJ: Prentice-Hall, Inc., 
1977. 


12) IEEE ASSP DSP Committee (Editor), Programs for Digital Signal 
Processing, New York, NY: IEEE Press, 1979. 


13) Jackson, Leland B., Digital Filters and Signal Processing, Hingham, MA: 
Kluwer Academic Publishers, 1986. 


14) Jones, D.L., and T.W. Parks, A Digital Signal Processing Laboratory Using 
the TMS32010, Englewood Cliffs, NJ: Prentice-Hall, Inc., 1987. 


15) Lim, Jae, and Alan V. Oppenheim, Advanced Topics in Signal Processing, 
Englewood Cliffs, NJ: Prentice- Hall, Inc., 1988. 
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Related Articles and Books 


16) Lin, K., G. Frantz, and R. Simar, Jr., “The TMS320 Family of Digital Signal 
Processors.” Proceedings of the IEEE, USA, Volume 75, Number 9, pages 
1143-1159, September 1987. 


17) Lovrich, A., Reimer, J., “An Advanced Audio Signal Processor.” Digest of 
Technical Papers for 1991 International Conference on Consumer 
Electronics, June 1991. 


18) Magar, S., D. Essig, E. Caudel, S. Marshall and R. Peters, “An NMOS 
Digital Signal Processor with Multiprocessing Capability.” Digest of IEEE 
International Solid-State Circuits Conference, USA, February 1985. 


19) Morris, Robert L., Digital Signal Processing Software, Ottawa, Canada: 
Carleton University, 1983. 


20) Oppenheim, Alan V. (Editor), Applications of Digital Signal Processing, 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1978. 


21) Oppenheim, Alan V., and R.W. Schafer, Digital Signal Processing, 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1975 and 1988. 


22) Oppenheim, A.V., A.N. Willsky, and I.T. Young, Signals and Systems, 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1983. 


23) Papamichalis, P.E., and C.S. Burrus, “Conversion of Digit-Reversed to 
Bit-Reversed Order in FFT Algorithms.” Proceedings of ICASSP 89, USA, 
pages 984-987, May 1989. 


24) Papamichalis, P., and R. Simar, Jr., “The TMS320C30 Floating-Point 
Digital Signal Processor.” JEEE Micro Magazine, USA, pages 13-29, 
December 1988. 


25) Parks, T.W., and C.S. Burrus, Digital Filter Design, New York, NY: John 
Wiley and Sons, Inc., 1987. 


26) Peterson, C., Zervakis, M., Shehadeh, N., “Adaptive Filter Design and 
Implementation Using the TMS320C25 Microprocessor.” Computers in 
Education Journal, USA, Volume 3, Number 3, pages 12-16, 
July-September 1993. 


27) Prado, J., and R. Alcantara, “A Fast Square-Rooting Algorithm Using a 
Digital Signal Processor.” Proceedings of IEEE, USA, Volume 75, Number 
2, pages 262-264, February 1987. 


28) Rabiner, L.R. and B. Gold, Theory and Applications of Digital Signal 
Processing, Englewood Cliffs, NJ: Prentice-Hall, Inc., 1975. 


29) Simar, Jr., R., and A. Davis, “The Application of High-Level Languages to 
Single-Chip Digital Signal Processors.” Proceedings of ICASSP 88, USA, 
Volume D, page 1678, April 1988. 


Related Articles and Books 


30) Simar, Jr., R., T. Leigh, P. Koeppen, J. Leach, J. Potts, and D. Blalock, “A 


40 MFLOPS Digital Signal Processor: the First Supercomputer on a Chip.” 
Proceedings of ICASSP 87, USA, Catalog Number 87CH2396-0, Volume 
1, pages 535-538, April 1987. 


31) Simar, Jr., R., and J. Reimer, “The TMS320C25: a 100 ns CMOS VLSI 


Digital Signal Processor.” 1986 Workshop on Applications of Signal 
Processing to Audio and Acoustics, September 1986. 


32) Texas Instruments, Digital Signal Processing Applications with the 


TMS320 Family, 1986; Englewood Cliffs, NJ: Prentice-Hall, Inc., 1987. 


33) Treichler, J.R., C.R. Johnson, Jr., and M.G. Larimore, A Practical Guide 


to Adaptive Filter Design, New York, NY: John Wiley and Sons, Inc., 1987. 


Graphics/Imagery: 


Andrews, H.C., and B.R. Hunt, Digital Image Restoration, Englewood 
Cliffs, NJ: Prentice-Hall, Inc., 1977. 


Gonzales, Rafael C., and Paul Wintz, Digital Image Processing, Reading, 
MA: Addison-Wesley Publishing Company, Inc., 1977. 


Papamichalis, P.E., “FFT Implementation on the TMS320C30.” 
Proceedings of ICASSP 88, USA, Volume D, page 1399, April 1988. 


Pratt, William K., Digital Image Processing, New York, NY: John Wiley and 
Sons, 1978. 


Reimer, J., and A. Lovrich, “Graphics with the TMS32020.” WESCON/85 
Conference Record, USA, 1985. 


Speech/Voice: 


1) 


DellaMorte, J., and P. Papamichalis, “Full-Duplex Real-Time 
Implementation of the FED-STD-1015 LPC-10e Standard V.52 on the 
TMS320C25.” Proceedings of SPEECH TECH 89, pages 218-221, May 
1989. 


Frantz, G.A., and K.S. Lin, “A Low-Cost Speech System Using the 
TMS320C17.” Proceedings of SPEECH TECH ’87, pages 25-29, April 
1987. 


Gray, A.H., and J.D. Markel, Linear Prediction of Soeech, New York, NY: 
Springer-Verlag, 1976. 

Jayant, N.S., and Peter Noll, Digital Coding of Waveforms, Englewood 
Cliffs, NJ: Prentice-Hall, Inc., 1984. 


Papamichalis, Panos, Practical Approaches to Speech Coding, Engle- 
wood Cliffs, NJ: Prentice-Hall, Inc., 1987. 


Papamichalis, P., and D. Lively, “Implementation of the DOD Standard 
LPC-—10/52E on the TMS320C25.” Proceedings of SPEECH TECH ’87, 
pages 201-204, April 1987. 
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Related Articles and Books 
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9) 


Pawate, B.I., and G.R. Doddington, “Implementation of a Hidden Markov 
Model-Based Layered Grammar Recognizer.” Proceedings of ICASSP 
89, USA, pages 801-804, May 1989. 


Rabiner, L.R., and R.W. Schafer, Digital Processing of Speech Signals, 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1978. 


Reimer, J.B. and K.S. Lin, “TMS320 Digital Signal Processors in Speech 
Applications.” Proceedings of SPEECH TECH ’88, April 1988. 


10) Reimer, J.B., M.L. McMahan, and W.W. Anderson, “Speech Recognition 


for a Low-Cost System Using a DSP.” Digest of Technical Papers for 1987 
International Conference on Consumer Electronics, June 1987. 


Control: 


1) 


2) 


9) 


Ahmed, |., “16-Bit DSP Microcontroller Fits Motion Control System 
Application.” PC/M, October 1988. 


Ahmed, |., “Implementation of Self Tuning Regulators with TMS320 
Family of Digital Signal Processors.” MOTORCON ‘88, pages 248-262, 
September 1988. 


Ahmed, |., and S. Lindquist, “Digital Signal Processors: Simplifying 
High-Performance Control.” Machine Design, September 1987. 


Ahmed, |., and S. Meshkat, “Using DSPs in Control.” Control Engineering, 
February 1988. 


Allen, C. and P. Pillay, “TMS320 Design for Vector and Current Control of 
AC Motor Drives.” Electronics Letters, UK, Volume 28, Number 23, pages 
2188-2190, November 1992. 


Bose, B.K., and P.M. Szczesny, “A Microcomputer-Based Control and 
Simulation of an Advanced IPM Synchronous Machine Drive System for 
Electric Vehicle Propulsion.” Proceedings of IECON ’87, Volume 1, pages 
454-463, November 1987. 


Hanselman, H., “LQG-Control of a Highly Resonant Disc Drive Head 
Positioning Actuator.” [EEE Transactions on Industrial Electronics, USA, 
Volume 35, Number 1, pages 100-104, February 1988. 


Jacquot, R., Modern Digital Control Systems, New York, NY: Marcel 
Dekker, Inc., 1981. 


Katz, P., Digital Control Using Microprocessors, Englewood Cliffs, NJ: 
Prentice-Hall, Inc., 1981. 


10) Kuo, B.C., Digital Control Systems, New York, NY: Holt, Reinholt, and 


Winston, Inc., 1980. 


Related Articles and Books 


11) Lovrich, A., G. Troullinos, and R. Chirayil, “An All-Digital Automatic Gain 
Control.” Proceedings of ICASSP 88, USA, Volume D, page 1734, April 
1988. 


12) Matsui, N. and M. Shigyo, ‘Brushless DC Motor Control Without Position 
and Speed Sensors.” /EEE Transactions on Industry Applications, USA, 
Volume 28, Number 1, Part 1, pages 120-127, January—February 1992. 


13) Meshkat, S., and |. Ahmed, “Using DSPs in AC Induction Motor Drives.” 
Control Engineering, February 1988. 


14) Panahi, |. and R. Restle, ‘DSPs Redefine Motion Control.” Motion Control 
Magazine, December 1993. 


15) Phillips, C., and H. Nagle, Digital Control System Analysis and Design, 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1984. 


Multimedia: 


1) Reimer, J., ‘DSP-Based Multimedia Solutions Lead Way Enhancing Audio 
Compression Performance.” Dr. Dobbs Journal, December 1993. 


2) Reimer, J., G. Benbassat, and W. Bonneau Jr., “Application Processors: 
Making PC Multimedia Happen.” Silicon Valley PC Design Conference, 
July 1991. 


Military: 


1) Papamichalis, P., and J. Reimer, “Implementation of the Data Encryption 
Standard Using the TMS32010.” Digital Signal Processing Applications, 
1986. 


Telecommunications: 


1) Ahmed, |, and A. Lovrich, “Adaptive Line Enhancer Using the 
TMS320C25.” Conference Records of Northcon/86, USA, 14/3/1-10, 
September/October 1986. 


2) Casale, S., R. Russo, and G. Bellina, “Optimal Architectural Solution 
Using DSP Processors for the Implementation of an ADPCM Transcoder.” 
Proceedings of GLOBECOM ’89, pages 1267-1273, November 1989. 


3) Cole, C., A. Haoui, and P. Winship, “A High-Performance Digital Voice 
Echo Canceller on a SINGLE TMS32020.” Proceedings of ICASSP 86, 
USA, Catalog Number 86CH2243-4, Volume 1, pages 429-432, April 
1986. 


4) Cole, C., A. Haoui, and P. Winship, “A High-Performance Digital Voice 
Echo Canceller on a Single TMS32020.” Proceedings of IEEE 
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International Conference on Acoustics, Speech and Signal Processing, 
USA, 1986. 


5) Lovrich, A., and J. Reimer, “A Multi-Rate Transcoder.” Transactions on 
Consumer Electronics, USA, November 1989. 


6) Lovrich, A. and J. Reimer, “A Multi-Rate Transcoder.” Digest of Technical 
Papers for 1989 International Conference on Consumer Electronics, June 
7-9, 1989. 


7) Lu, H., D. Hedberg, and B. Fraenkel, “Implementation of High-Speed 
Voiceband Data Modems Using the TMS320C25.” Proceedings of 
ICASSP 87, USA, Catalog Number 87CH2396-0, Volume 4, pages 
1915-1918, April 1987. 


8) Mock, P., “Add DTMF Generation and Decoding to DSP— uP Designs.” 
Electronic Design, USA, Volume 30, Number 6, pages 205-213, March 
1985. 


9) Reimer, J., M. McMahan, and M. Arjmand, “ADPCM on a TMS320 DSP 
Chip.” Proceedings of SPEECH TECH 85, pages 246-249, April 1985. 


10) Troullinos, G., and J. Bradley, “Split-Band Modem Implementation Using 
the TMS32010 Digital Signal Processor.” Conference Records of 
Electro/86 and Mini/Micro Northeast, USA, 14/1/1-21, May 1986. 


Automotive: 


1) Lin, K., “Trends of Digital Signal Processing in Automotive.” /nternational 
Congress on Transportation Electronic (CONVERGENCE ’88), October 
1988. 


Consumer: 


1) Frantz, G.A., J.B. Reimer, and R.A. Wotiz, “Julie, The Application of DSP 
to a Product.” Speech Tech Magazine, USA, September 1988. 


2) Reimer, J.B., and G.A. Frantz, “Customization of a DSP Integrated Circuit 
for a Customer Product.” Transactions on Consumer Electronics, USA, 
August 1988. 


3) Reimer, J.B., PE. Nixon, E.B. Boles, and G.A. Frantz, “Audio 
Customization of a DSP IC.” Digest of Technical Papers for 1988 
International Conference on Consumer Electronics, June 8-10 1988. 


Medical: 


1) Knapp and Townshend, “A Real-Time Digital Signal Processing System 
for an Auditory Prosthesis.” Proceedings of ICASSP 88, USA, Volume A, 
page 2493, April 1988. 


Related Articles and Books 


2) Morris, L.R., and P.B. Barszczewski, “Design and Evolution of a 
Pocket-Sized DSP Speech Processing System for a Cochlear Implant and 
Other Hearing Prosthesis Applications.” Proceedings of ICASSP 88, USA, 
Volume A, page 2516, April 1988. 


Development Support: 


1) Mersereau, R., R. Schafer, T. Barnwell, and D. Smith, “A Digital Filter 
Design Package for PCs and TMS320.” MIDCON/84 Electronic Show and 
Convention, USA, 1984. 


2) Simar, Jr., R., and A. Davis, “The Application of High-Level Languages to 
Single-Chip Digital Signal Processors.” Proceedings of ICASSP 88, USA, 
Volume 3, pages 1678-1681, April 1988. 
If You Need Assistance. . .


If you want to...                   Do this...

Request more information about      Write to:
Texas Instruments Digital Signal    Texas Instruments Incorporated
Processing (DSP) products           Market Communications Manager
                                    MS 736
                                    P.O. Box 1443
                                    Houston, Texas 77251-1443

Order Texas Instruments             Call the TI Literature Response Center:
documentation                       (800) 477-8924

Ask questions about product         Contact the DSP hotline:
operation or report suspected       Phone: (713) 274-2320
problems                            FAX: (713) 274-2324
                                    Electronic Mail: 4389750@mcimail.com.

Obtain the source code in this      Call the TI BBS:
user’s guide.                       (713) 274-2323

                                    Ftp from:
                                    ftp.ti.com
                                    log in as user ftp
                                    cd to /mirrors/tms320bbs

Visit TI online, including          Point your browser at:
TI&ME™, your own customized         http://www.ti.com
web page.

Report mistakes or make             Send electronic mail to:
comments about this or any          comments@books.sc.ti.com
other TI documentation.
                                    Send printed comments to:
                                    Texas Instruments Incorporated
                                    Technical Publications Mgr., MS 702
                                    P.O. Box 1443
                                    Houston, Texas 77251-1443


Trademarks


MS is a registered trademark of Microsoft Corp.


MS-Windows is a registered trademark of Microsoft Corp.


MS-DOS is a registered trademark of Microsoft Corp.


OS/2 is a trademark of International Business Machines Corp.


Sun and SPARC are trademarks of Sun Microsystems, Inc.


VAX and VMS are trademarks of Digital Equipment Corp.
Chapter 1


Introduction


The TMS320C4x devices are 32-bit floating-point digital signal processors
optimized for parallel processing. The ’C4x family combines a high-performance
CPU and DMA controller with up to six communication ports to meet the needs
of multiprocessor and I/O-intensive applications. All ’C4x devices are
compatible with TI’s multi-chip development environment. Each device contains
an on-chip analysis module, which supports hardware breakpoints for
parallel-processing development and debugging. The ’C4x family is source-code
compatible with the TMS320C3x family of floating-point DSPs.


1.1 TMS320C4x Devices 


The TMS320C4x family is made up of three different members: the 
TMS320C40, the TMS320LC40, and the TMS320C44. 


1.1.1 The TMS320C40


The TMS320C40 is the original member of the ’C4x family. It features a CPU
that can deliver up to 30 MIPS/60 MFLOPS with a maximum I/O bandwidth of
384M bytes/s. The ’C40 has 2K words of on-chip RAM, 128 words of program
cache, and a bootloader. Two external buses provide an address reach of
4 gigawords of unified memory space. The ’C40 is available in a 325-pin CPGA
package.


1.1.2 The TMS320C44 


The TMS320C44 is a lower-cost version of the ’C40, for parallel processing
applications that are more price sensitive. The ’C44 features four
communication ports and has an external address reach of 32M words over two
external buses. To further reduce cost, the ’C44 comes in a 304-pin PQFP package.
The TMS320C44 can deliver up to 30 MIPS/60 MFLOPS performance with a
maximum I/O bandwidth of 384M bytes/s. The ’C44 is source-code compatible
with the ’C40.


1.1.3 The TMS320LC40 


The TMS320LC40 is the newest member of the ’C4x family. It is a low-power
version of the ’C40 capable of delivering up to 40 MIPS/80 MFLOPS with a
maximum I/O bandwidth of 488M bytes/s for high-performance multiprocessing
applications. The ’LC40 is source-code compatible with the ’C40 and
’C44.


Note:


See the chapter, Development Support and Part Order Information, in the
TMS320C4x General-Purpose Applications User’s Guide for device speeds,
device availability information, and part numbers.
1.2 Key Features of the TMS320C4x


The TMS320C4x has several key features:


Up to 40 MIPS/80 MFLOPS performance with 488-Mbytes/s I/O capability

    IEEE floating-point conversion for ease of use
    Register-based CPU
    Single-cycle byte and half-word manipulation capabilities
    Divide and square root support for improved performance

On-chip memory includes 2K words of SRAM, 128 words of program
cache, and bootloader

Two external buses providing an address reach of up to 4 gigawords

Two memory-mapped 32-bit timers

6 and 12 channel DMA

Up to six communication ports for multiprocessor communication

Idle mode for reduced power consumption
1.3 TMS320C40 and TMS320C44 Device Comparison


Table 1-1 shows the major differences in features of the ’C40 and ’C44.


Table 1-1. Comparison of ’C40 and ’C44 Features

Feature                        ’C40                    ’C44

External local address bus     31 pins                 24 pins
External global address bus    31 pins                 24 pins
Address reach                  4G x 32                 32M x 32
Number of comm ports           6                       4
Commport direction pin         no                      yes
NMI with bus grant feature     yes                     yes
                               (for revisions > 5.0)
Individual comm port reset     yes                     yes
                               (for revisions > 5.0)
Package                        325-pin CPGA            304-pin PQFP
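The address-reach row of Table 1-1 follows from the address pin counts on the two external buses. The short Python check below is only an arithmetic sketch: the function name is hypothetical, and it assumes that each bus decodes its own equal-sized region of the unified memory space, which the table implies but does not state.

```python
def address_reach_words(address_pins: int, buses: int = 2) -> int:
    """Words reachable when each of `buses` decodes `address_pins` address lines.

    Assumption (not stated in the table): the local and global buses cover
    separate, equal-sized regions, so their reaches simply add.
    """
    return buses * (1 << address_pins)


c40_reach = address_reach_words(31)  # two buses with 31 address pins each
c44_reach = address_reach_words(24)  # two buses with 24 address pins each

print(c40_reach // 2**30, "gigawords")  # 'C40 row: 4G x 32
print(c44_reach // 2**20, "megawords")  # 'C44 row: 32M x 32
```

Under that assumption the 31-pin buses give 2 x 2^31 = 2^32 words (4 gigawords) and the 24-pin buses give 2 x 2^24 = 2^25 words (32M words), matching the table.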


Chapter 2


Architectural Overview


The ’C4x’s high performance is achieved through the precision and wide
dynamic range of the floating-point units, on-chip memory, a high degree of
parallelism, communication ports, and the DMA coprocessor.


This chapter gives an architectural overview of the ’C4x processor. Figure 2-1 
is a block diagram of the ’C4x. 
Figure 2-1. TMS320C4x Block Diagram


2.1 Central Processing Unit (CPU)


The ’C4x’s CPU has a register-based architecture. The CPU consists of the
following components:


Floating-point/integer multiplier
Arithmetic logic unit (ALU)
32-bit barrel shifter
Internal buses (CPU1/CPU2 and REG1/REG2)
Auxiliary register arithmetic units (ARAUs)
CPU register file


Figure 2-2 shows the CPU’s components.


2.1.1 Floating-Point/Integer Multiplier


The multiplier performs single-cycle multiplications on 32-bit integer and 40-bit
floating-point values. The ’C4x implementation of floating-point arithmetic
allows for floating-point operations at fixed-point speeds via a 25-ns instruction
cycle and a high degree of parallelism. To gain even higher throughput, you
can use parallel instructions to perform a multiply and ALU operation in a single
cycle.
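The 25-ns cycle time relates directly to the peak figures quoted in Chapter 1. The Python sketch below is a back-of-the-envelope check, not a performance model: it assumes one instruction completes per cycle and that a parallel multiply/ALU instruction counts as two floating-point operations per cycle. The function name is hypothetical.

```python
def peak_rates(cycle_ns: float, flops_per_cycle: int = 2) -> tuple[float, float]:
    """Return (MIPS, MFLOPS) for a given instruction cycle time in nanoseconds.

    Assumptions: one instruction per cycle, and `flops_per_cycle`
    floating-point operations per cycle (a multiply plus an ALU operation
    issued as one parallel instruction).
    """
    mips = 1000.0 / cycle_ns  # 1000 ns per microsecond -> million instructions/s
    return mips, mips * flops_per_cycle


mips, mflops = peak_rates(25)
print(mips, mflops)  # 40.0 80.0
```

With these assumptions a 25-ns cycle corresponds to the 40 MIPS/80 MFLOPS peak listed among the key features.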


When the multiplier performs floating-point multiplication, the inputs are 40-bit 
floating-point numbers, and the result is a 40-bit floating-point number. When 
the multiplier performs integer multiplication, the input data is 32 bits and yields 
either the 32 most-significant bits or the 32 least-significant bits of the resulting 
64-bit product. See Chapter 5, Data Formats and Floating-Point Operation, for 
detailed information on data formats and floating-point operation. 
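To make the integer case concrete, the following Python sketch models a 32 x 32-bit multiply whose full 64-bit product is split into its upper and lower 32-bit halves. It is an illustration only: it assumes signed two's-complement operands, the helper names are hypothetical, and it does not model which instruction selects which half.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of `value` as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def multiply_32x32(a: int, b: int) -> tuple[int, int]:
    """Return (high, low): the upper and lower 32 bits of the 64-bit product."""
    product = to_signed32(a) * to_signed32(b)  # exact signed 64-bit product
    low = product & MASK32                     # 32 least-significant bits
    high = (product >> 32) & MASK32            # 32 most-significant bits
    return high, low


print(multiply_32x32(0x10000, 0x10000))  # (1, 0): the product is 2**32
print(multiply_32x32(-1, 2))             # (4294967295, 4294967294): the product is -2
```

A result that fits in 32 bits appears entirely in the low half; the high half matters once the product overflows 32 bits, as in the first call.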


2.1.2 Arithmetic Logic Unit (ALU) and Internal Buses 


The ALU performs single-cycle operations on 32-bit integer, 32-bit logical, and
40-bit floating-point data, including single-cycle integer and floating-point
conversions. Results of the ALU are always maintained in 32-bit integer or 40-bit
floating-point formats. The barrel shifter is used to shift up to 32 bits left or right
in a single cycle.


Four internal buses, CPU1, CPU2, REG1, and REG2, carry two operands from
memory and two operands from the register file, thus allowing parallel
multiplies and adds/subtracts on four integer or floating-point operands in a single
cycle.


Figure 2-2. Central Processing Unit (CPU) 




2.1.3 Auxiliary Register Arithmetic Units (ARAUs) 


The two auxiliary register arithmetic units (ARAU0 and ARAU1) can generate
two addresses in a single cycle. The ARAUs operate in parallel with the
multiplier and ALU. They support addressing with displacements, index registers
(IR0 and IR1), and circular and bit-reversed addressing. See Chapter 6,
Addressing Modes, for a description of addressing modes.
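

As an illustration of bit-reversed addressing, the Python sketch below models
address generation as an addition whose carry propagates from the most
significant bit toward the least significant bit (reverse-carry addition). This
is an assumption about the mechanism, not a statement from this overview, and
the function names are hypothetical. Chapter 6 gives the authoritative
description.

```python
def reverse_bits(value, width=32):
    """Reverse the order of the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(address, index, width=32):
    """Add index to address with the carry propagating toward the LSB."""
    mask = (1 << width) - 1
    total = reverse_bits(address & mask, width) + reverse_bits(index & mask, width)
    return reverse_bits(total & mask, width)


def bit_reversed_sequence(base, index, count):
    """List `count` addresses starting at base, stepping by reverse-carry add."""
    addresses = []
    address = base
    for _ in range(count):
        addresses.append(address)
        address = reverse_carry_add(address, index)
    return addresses


# Assumed usage: an 8-point data block with the index set to half its size.
print(bit_reversed_sequence(0, 4, 8))
```

With an index of half the block size, the model visits an 8-word block in the
order 0, 4, 2, 6, 1, 5, 3, 7, which is the access order an 8-point FFT needs.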


2.1.4 CPU Primary Register File 
The ’C4x primary register file provides 32 registers in a multiport register file
that is tightly coupled to the CPU. Table 2-1 lists register names and functions,
followed by the section number and page of each description.


All of the primary register file registers can be operated upon by the multiplier 
and ALU and can be used as general-purpose registers. However, the regis- 
ters also have some special functions. For example, the 12 extended-preci- 
sion registers are especially suited for maintaining floating-point results. The 
eight auxiliary registers support a variety of indirect addressing modes and can 
be used as general-purpose 32-bit integer and logical registers. The remaining 
registers provide system functions such as addressing, stack management, 
processor status, interrupts, and block repeat. See Chapter 3, CPU Registers, 
for detailed information about CPU registers. See Chapter 6, Addressing 
Modes, for information about register usage in addressing. 


The extended-precision registers (R0-R11) are capable of storing and
supporting operations on 32-bit integer and 40-bit floating-point numbers. Any
instruction that assumes that the operands are floating-point numbers uses bits
39-0. If the operands are either signed or unsigned integers, only bits 31-0
are used, and bits 39-32 remain unchanged. This is true for all shift operations.
See Chapter 5, Data Formats and Floating-Point Operation, for
extended-precision register formats of floating-point and integer numbers.
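

The effect of an integer result on a 40-bit register can be sketched as follows.
This Python model is illustrative only; the helper names are hypothetical, and
the register is represented as a plain 40-bit integer.

```python
MASK32 = 0xFFFFFFFF
MASK40 = (1 << 40) - 1


def pack_register(upper8, lower32):
    """Build a 40-bit register value from bits 39-32 and bits 31-0."""
    return ((upper8 & 0xFF) << 32) | (lower32 & MASK32)


def write_integer(register, value):
    """Store an integer result in bits 31-0; bits 39-32 keep their contents."""
    return (register & MASK40 & ~MASK32) | (value & MASK32)


register = pack_register(0x7F, 0x40000000)
register = write_integer(register, 0x12345678)
print(hex(register))
```

After the integer write, bits 39-32 of the model still hold 7Fh, so a later
floating-point instruction that reads all 40 bits would see a stale upper byte
combined with the new lower 32 bits.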


The 32-bit auxiliary registers (AR0-AR7) can be accessed by the CPU and
modified by the two auxiliary register arithmetic units (ARAUs). The primary
function of the auxiliary registers is the generation of 32-bit addresses. They
can also be used as loop counters or as 32-bit general-purpose registers that
can be modified by the multiplier and ALU. See Chapter 6, Addressing Modes,
for detailed information and examples of the use of auxiliary registers in
addressing.




Table 2-1. CPU Primary Registers


Assembler
Syntax      Assigned Function Name                 Subsection  Page
R0          Extended-precision register 0          3.1.1       3-3
R1          Extended-precision register 1          3.1.1       3-3
R2          Extended-precision register 2          3.1.1       3-3
R3          Extended-precision register 3          3.1.1       3-3
R4          Extended-precision register 4          3.1.1       3-3
R5          Extended-precision register 5          3.1.1       3-3
R6          Extended-precision register 6          3.1.1       3-3
R7          Extended-precision register 7          3.1.1       3-3
R8          Extended-precision register 8          3.1.1       3-3
R9          Extended-precision register 9          3.1.1       3-3
R10         Extended-precision register 10         3.1.1       3-3
R11         Extended-precision register 11         3.1.1       3-3
AR0         Auxiliary register 0                   3.1.2       3-4
AR1         Auxiliary register 1                   3.1.2       3-4
AR2         Auxiliary register 2                   3.1.2       3-4
AR3         Auxiliary register 3                   3.1.2       3-4
AR4         Auxiliary register 4                   3.1.2       3-4
AR5         Auxiliary register 5                   3.1.2       3-4
AR6         Auxiliary register 6                   3.1.2       3-4
AR7         Auxiliary register 7                   3.1.2       3-4
DP          Data-page pointer                      3.1.3       3-4
IR0         Index register 0                       3.1.4       3-4
IR1         Index register 1                       3.1.4       3-4
BK          Block-size register                    3.1.5       3-5
SP          System stack pointer                   3.1.6       3-5
ST          Status register                        3.1.7       3-5
DIE         DMA coprocessor interrupt enable       3.1.8       3-8
IIE         Internal-interrupt enable register     3.1.9       3-11
IIF         IIOF flag register                     3.1.10      3-13
RS          Repeat start address                   3.1.11      3-16
RE          Repeat end address                     3.1.11      3-16
RC          Repeat counter                         3.1.11      3-16


The data page pointer (DP) is a 32-bit register. The 16 LSBs of the data page 
pointer are used by the direct addressing mode as a pointer to the page of data 
being addressed. The ’C4x can address up to 64K pages, each page contain- 
ing 64K words. Use of the data page pointer is described in subsection 6.3, 
Direct Addressing, on page 6-5. 
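

The address formation used by direct addressing can be written out as a small
calculation. The Python sketch below is a model, not ’C4x code; the function
name is hypothetical. It concatenates the 16 LSBs of the DP with a 16-bit
operand from the instruction word.

```python
PAGE_WORDS = 1 << 16   # 64K words per page
PAGE_COUNT = 1 << 16   # 64K pages


def direct_address(dp, operand):
    """Form a 32-bit address from the 16 LSBs of DP and a 16-bit operand."""
    page = dp & 0xFFFF        # only the 16 LSBs of the DP select the page
    offset = operand & 0xFFFF
    return (page << 16) | offset


print(hex(direct_address(0x002F, 0xF800)))
print(PAGE_COUNT * PAGE_WORDS == 1 << 32)
```

The product of 64K pages and 64K words per page is 2^32 words, which matches
the 4G-word memory reach stated in Section 2.2.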


The 32-bit index registers contain the value used by the auxiliary register 
arithmetic unit (ARAU) to compute an indexed address. See Section 6.4, Indi- 
rect Addressing, on page 6-6, and Section 6.9, Bit-Reversed Addressing, on 
page 6-32, for more information about the ARAU. 


The ARAU uses the 32-bit block size register (BK) in circular addressing to 
specify the data block size. Circular addressing is described in Section 6.8, Cir- 
cular Addressing, on page 6-27. 


The system stack pointer (SP) is a 32-bit register that contains the address 
of the top of the system stack. The SP always points to the last element pushed 
onto the stack. A push performs a preincrement, and a pop performs a post- 
decrement of the system stack pointer. The SP is manipulated by interrupts, 
traps, calls, returns, and the PUSH/PUSHF and POP/POPF instructions. See 
Section 1.4, System and User Stack Management, in the TMS320C4x Gener- 
al-Purpose Applications User’s Guide for information about managing the 
stacks. 
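

The preincrement and postdecrement behavior can be modeled in a few lines. In
this Python sketch the class name and the dictionary used for memory are
illustrative assumptions; only the ordering of the pointer update and the
memory access is taken from the description above.

```python
class SystemStack:
    """Model of a stack that grows toward higher addresses."""

    def __init__(self, initial_sp):
        self.sp = initial_sp & 0xFFFFFFFF
        self.memory = {}  # address -> 32-bit word (illustrative storage)

    def push(self, value):
        # Preincrement: advance SP first, then store at the new top.
        self.sp = (self.sp + 1) & 0xFFFFFFFF
        self.memory[self.sp] = value & 0xFFFFFFFF

    def pop(self):
        # Postdecrement: read the top element, then move SP back.
        value = self.memory[self.sp]
        self.sp = (self.sp - 1) & 0xFFFFFFFF
        return value


stack = SystemStack(0x002FF800)
stack.push(0x11111111)
stack.push(0x22222222)
print(hex(stack.sp), hex(stack.pop()), hex(stack.pop()), hex(stack.sp))
```

Because the SP always points to the last element pushed, the initial SP value
in this model is one below the first location that a push writes.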




The status register (ST) contains global information related to the state of the 
CPU. Typically, operations set the condition flags of the status register accord- 
ing to whether the result is zero, negative, etc. This includes register load and 
store operations as well as arithmetic and logical functions. When the status 
register is loaded, however, a bit-for-bit replacement is performed with the con- 
tents of the source operand, regardless of the state of any bits in the source 
operand. Therefore, following a load, the contents of the status register are 
identically equal to the contents of the source operand. This allows the status 
register to be easily saved and restored. See subsection 3.1.7, Status Register 
(ST), on page 3-5, for definitions of the status register bits. 


The DMA coprocessor interrupt enable register (DIE) is a 32-bit register 
containing 2- and 3-bit fields to designate the interrupt synchronization 
scheme for each of the six DMA channels. It allows each DMA channel to ser- 
vice a corresponding input communication port and output communication 
port. Also, each DMA channel can be synchronized with external interrupts or 
the on-chip timers. This register is described in subsection 3.1.8, DMA Copro- 
cessor Interrupt Enable Register (DIE), on page 3-8. 


The CPU internal interrupt enable register (IIE) is a 32-bit register that en- 
ables/disables interrupts for the six communication ports, both timers, and the 
six DMA coprocessor channels. The IIE is described in subsection 3.1.9, CPU 
Interrupt Enable Register (IIE), on page 3-11. 


The IIOF flag register (IIF) controls the function (general-purpose I/O or
interrupt) of the four external pins (IIOF0 to IIOF3). It also contains timer/DMA
interrupt flags. Subsection 3.1.10, IIOF Flag Register (IIF), on page 3-13,
provides further description of this register.


The 32-bit repeat counter (RC) register specifies the number of times a block 
of code is to be repeated when a block repeat is performed. When the proces- 
sor is operating in the repeat mode, the 32-bit repeat start address register 
(RS) contains the starting address of the block of program memory to be re- 
peated, and the 32-bit repeat end address register (RE) contains the ending 
address of the block to be repeated. Further information about these registers
is in subsection 3.1.11, Block Repeat (RS, RE) and Repeat Count (RC) Registers,
on page 3-16.


The program counter (PC) is a 32-bit register containing the address of the 
next instruction to be fetched. Although the PC is not part of the CPU register 
file, it is a register that can be modified by instructions that modify the program 
flow. 
2.1.5 CPU Expansion Register File


Besides the CPU primary register file, the expansion register file contains two 
special registers that act as pointers: 


- The IVTP register points to the interrupt-vector table (IVT), which defines
  vectors for all interrupts.

- The TVTP register points to the trap vector table (TVT), which defines
  vectors for 512 traps.


These two registers are fully described in Section 3.2, CPU Expansion
Register File, on page 3-17.




2.2 Memory Organization 


The total memory reach of the ’C4x is 4G 32-bit words. Program memory
(on-chip RAM or ROM and external memory) as well as registers affecting timers,
communication ports, and DMA channels are contained within this space. This
allows tables, coefficients, program code, and data to be stored in either RAM
or ROM. Thus, memory usage is maximized, and memory space can be allocated
as desired.


By manipulating one external pin (ROMEN), you can configure the first
one-megaword area of memory (0000 0000h to 000F FFFFh) to address the local
address bus or to address the on-chip ROM when you use the bootloader (with
remaining space reserved). This capability is further discussed in Section 4.1,
Memory Map, on page 4-2.


2.2.1 RAM, ROM, and Cache


Figure 2-3 shows how the memory is organized on the ’C4x. RAM blocks 0
and 1 are 4K bytes (1K x 32 bits) each. The ROM block is reserved and
contains a bootloader. Each RAM and ROM block is capable of supporting two
accesses in a single cycle. The separate program buses, data buses, and DMA
buses allow for parallel program fetches, data reads and writes, and DMA
operations. For example, the CPU can access two data values in one RAM block
and perform an external program fetch in parallel with the DMA coprocessor
loading another RAM block, all within a single cycle.


The reserved ROM block (upper right in Figure 2-3) contains a bootloader. 
This loader supports loading of program and data at reset time. Loading is from 
8-, 16-, or 32-bit wide memories or any one of the six communication ports. 
Chapter 10, The Bootloader, explains the bootloader in detail. 


A 128 x 32-bit instruction cache is provided to store often-repeated sections
of code, thus greatly reducing the number of off-chip accesses needed. This
allows code to be stored off-chip in slower, lower-cost memories. When your
program executes from the cache, the external buses are freed for use by the
DMA controller or CPU.


For further information about memory and the instruction cache, see Section 
4.1, Memory Organization, and Section 4.3, Cache Memory. 
Figure 2-3. Memory Organization


2.2.2 Memory Maps 


The memory map for each processor is shown in Figure 2-4 (’C40) and
Figure 2-5 (’C44). For each processor, the level at the external pin ROMEN
determines whether the first megaword of memory addresses the internal
ROM or external memory. The maps illustrate the entire address space of the
’C40 and ’C44.


The value of ROMEN affects only the first megaword of memory: 


- A 1 at external pin ROMEN causes internal ROM to be enabled at 0000h
  with the one-megaword space reserved (0000 0000h - 000F FFFFh).
  This is shown in the right side of the figure.

- A 0 at ROMEN causes addresses 0000 0000h - 000F FFFFh to be
  accessible on the local bus. This is shown in the left side of the figure.


The rest of the memory map is the same for either level of ROMEN: 


- The second megaword of memory is devoted to peripherals (as shown in
  Figure 2-6).

- The third megaword of memory contains the two 1K-word (4K-byte) blocks
  of RAM (BLK0 and BLK1 as shown at 002F F800h - 002F FFFFh).

- The rest of the first 2 gigawords (0030 0000h - 7FFF FFFFh) is on the
  local bus (external).

- The second 2 gigawords (8000 0000h - FFFF FFFFh) are on the global
  bus (external).
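

The map described above can be summarized as an address decoder. The Python
sketch below is a reading aid built only from the ranges listed in this
subsection; the function name and the region labels are hypothetical. It does
not say what lies in the third megaword below the RAM blocks, because this
overview does not detail that range (see Figure 2-4 and Figure 2-5).

```python
def decode_region(address, romen):
    """Classify a 32-bit word address according to the overview memory map.

    romen is the logic level (0 or 1) at the external ROMEN pin.
    """
    address &= 0xFFFFFFFF
    if address <= 0x000FFFFF:                 # first megaword
        return "internal ROM / reserved" if romen else "local bus"
    if address <= 0x001FFFFF:                 # second megaword
        return "peripherals"
    if address <= 0x002FFFFF:                 # third megaword
        if address >= 0x002FF800:
            return "on-chip RAM (BLK0/BLK1)"
        return "third megaword, outside RAM blocks"
    if address <= 0x7FFFFFFF:                 # rest of the first 2 gigawords
        return "local bus"
    return "global bus"                       # second 2 gigawords


for addr in (0x00000000, 0x00100020, 0x002FFC00, 0x00300000, 0x80000000):
    print(hex(addr), decode_region(addr, romen=1))
```

Only the first branch depends on ROMEN, which mirrors the statement that the
pin affects only the first megaword.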


Section 4.1, Memory Map, on page 4-2 describes the memory maps in greater 
detail. Section 9.2, Memory Interface Signals on page 9-3, and Section 9.3, 
Memory Interface Control Registers on page 9-6, discuss the local and global 
interfaces to memory. The peripheral bus map and the vector locations for re- 
set, interrupts, and traps are also explained in those sections. 


Caution


Any access to a reserved area in the address space produces
unpredictable results. Do not attempt to access reserved areas.
Figure 2-4. ’C40 Memory Map
Figure 2-5. ’C44 Memory Map
Figure 2-6. Peripheral Memory Map


Address                    Peripheral                                    Described in
0010 0000h - 0010 000Fh    Local and global port control (16 words)      Subsection 4.2.1, Figure 4-4, page 4-6
0010 0010h - 0010 001Fh    Analysis block registers (16 words)           Subsection 4.2.2
0010 0020h - 0010 002Fh    Timer 0 registers (16 words)                  Subsection 4.2.3, Figure 4-5, page 4-7
0010 0030h - 0010 003Fh    Timer 1 registers (16 words)
0010 0040h - 0010 004Fh    Communication port 0 (16 words) (’C40 only)   Subsection 4.2.4, Figure 4-6, page 4-8
0010 0050h - 0010 005Fh    Communication port 1 (16 words)
0010 0060h - 0010 006Fh    Communication port 2 (16 words)
0010 0070h - 0010 007Fh    Communication port 3 (16 words) (’C40 only)
0010 0080h - 0010 008Fh    Communication port 4 (16 words)
0010 0090h - 0010 009Fh    Communication port 5 (16 words)
0010 00A0h - 0010 00AFh    DMA coprocessor channel 0 (16 words)          Subsection 4.2.5, Figure 4-7, page 4-9
0010 00B0h - 0010 00BFh    DMA coprocessor channel 1 (16 words)
0010 00C0h - 0010 00CFh    DMA coprocessor channel 2 (16 words)
0010 00D0h - 0010 00DFh    DMA coprocessor channel 3 (16 words)
0010 00E0h - 0010 00EFh    DMA coprocessor channel 4 (16 words)
0010 00F0h - 0010 00FFh    DMA coprocessor channel 5 (16 words)
2.2.3 Memory Aliasing (’C44 only)


Memory aliasing occurs in the ’C44 because the global and local ports on
that device each have 24 address pins, instead of the 31 address pins on each
port in the ’C40. Memory aliasing causes the first 16M words of each address
space to be repeated in the memory map. Memory on the local bus occupies,
and is aliased in, the first 2G words of address space, and memory on the
global bus occupies, and is aliased in, the second 2G words of address space.
Figure 2-7 shows the alias regions on the local and global buses.


Figure 2-7. Memory Aliasing (’C44 only)


Region                 Local bus                    Global bus
Base address region    0x0000 0000 - 0x00FF FFFF    0x8000 0000 - 0x80FF FFFF
Alias 1                0x0100 0000 - 0x01FF FFFF    0x8100 0000 - 0x81FF FFFF
Alias 2                0x0200 0000 - 0x02FF FFFF    0x8200 0000 - 0x82FF FFFF
...                    ...                          ...
Alias n                0x7F00 0000 - 0x7FFF FFFF    0xFF00 0000 - 0xFFFF FFFF
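

The aliasing in Figure 2-7 follows from the address width: with 24 address
pins, only the 24 LSBs of an address reach the external bus. The Python sketch
below models that truncation. It is a simplification with a hypothetical
function name, and it ignores the on-chip regions at the bottom of the local
address range.

```python
ADDRESS_PINS = 24  # external address pins per port on the 'C44


def c44_external_address(address):
    """Return (bus, pin_address) for a 32-bit address in the aliasing model."""
    address &= 0xFFFFFFFF
    bus = "global" if address & 0x80000000 else "local"
    return bus, address & ((1 << ADDRESS_PINS) - 1)


# An address in "Alias 1" and one in "Alias 2" reach the same external word.
print(c44_external_address(0x01000010))
print(c44_external_address(0x02000010))
print(c44_external_address(0x82000010))
```

In this model every 16M-word region of a bus maps onto the same 2^24 external
locations, which is why the base region repeats through each 2G-word half.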
2.2.4 Memory Addressing Modes


The ’C4x supports a base set of general-purpose instructions as well as arith- 
metic-intensive instructions that are particularly suited for digital signal pro- 
cessing and other numeric-intensive applications. Refer to Chapter 6, Addres- 
sing Modes, for detailed information on addressing. 


Four groups of addressing modes are provided on the ’C4x. Each group uses 
two or more of several different addressing types. The following list shows the 
addressing modes with their addressing types. 


- General addressing modes:
    - Register. The operand is a CPU register.
    - Immediate. The operand is a 16-bit immediate value.
    - Direct. The operand is the contents of a 32-bit address
      (concatenation of 16 bits of the data page pointer and a 16-bit
      operand).
    - Indirect. A 32-bit auxiliary register indicates the address of the
      operand.

- Three-operand addressing modes:
    - Register. (same as for general addressing mode).
    - Indirect. (same as for general addressing mode).
    - Immediate. The operand is an 8-bit immediate value.

- Parallel addressing modes:
    - Register. The operand is an extended-precision register.
    - Indirect. (same as for general addressing mode).

- Branch addressing modes:
    - Register. (same as for general addressing mode).
    - PC-relative. A signed 16-bit displacement or a 24-bit displacement is
      added to the PC.
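

For the PC-relative type, the key step is sign extension of the displacement
before the addition. The Python sketch below shows that step for 16-bit and
24-bit displacements. The function names are hypothetical, and the model adds
the displacement directly to the PC value passed in; any fixed offset that the
hardware applies relative to the branch instruction is outside this overview
and is described in Chapter 6.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a twos-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def pc_relative_target(pc, displacement, bits):
    """Add a signed displacement of the given width to a 32-bit PC value."""
    return (pc + sign_extend(displacement, bits)) & 0xFFFFFFFF


print(hex(pc_relative_target(0x1000, 0xFFFC, 16)))    # backward branch
print(hex(pc_relative_target(0x1000, 0x000010, 24)))  # forward branch
```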




2.3 Internal Bus Operation 


A large portion of the ’C4x’s high performance is due to internal busing and
parallelism. Separate buses allow for parallel program fetches, data accesses,
and DMA accesses:


- Program buses PADDR and PDATA
- Data buses DADDR1, DADDR2, and DDATA
- DMA buses DMAADDR and DMADATA


These buses connect all of the physical spaces (on-chip memory, off-chip 
memory, and on-chip peripherals) supported by the ’C4x. Figure 2-3 shows 
these internal buses and their connections to on-chip and off-chip memory 
blocks. 


The program counter (PC) is connected to the 32-bit program address bus 
(PADDR). The instruction register (IR) is connected to the 32-bit program data 
bus (PDATA). In this configuration, the buses can fetch a single instruction 
word every machine cycle. 


The 32-bit data address buses (DADDR1 and DADDR2) and the 32-bit data 
data bus (DDATA) support two data memory accesses every machine cycle. 
The DDATA bus carries data to the CPU over the CPU1 and CPU2 buses. The 
CPU1 and CPU2 buses can carry two data memory operands to the multiplier, 
ALU, and register file every machine cycle. Also internal to the CPU are regis- 
ter buses REG1 and REG2, which can carry two data values from the register 
file to the multiplier and ALU every machine cycle. Figure 2-2 shows the buses
that are internal to the CPU section of the processor. 


The DMA controller is supported with a 32-bit address bus (DMAADDR) and 
a 32-bit data bus (DMADATA). These buses allow the DMA to perform memory 
accesses in parallel with the memory accesses occurring from the data and 
program buses. 
2.4 External Bus Operation


The ’C4x provides two identical external interfaces: the global memory
interface and the local memory interface. Each consists of a 32-bit data bus, a
31-bit (’C40) or 24-bit (’C44) address bus, and two sets of control signals. Both
buses can be used to address external program/data memory or I/O space.
The buses also have external RDY signals for wait-state generation with wait
states inserted under software control. Chapter 9, External Bus Operation,
covers external bus operation.


For multiple processors to access global memory and share data in a coherent 
manner, arbitration is necessary. This arbitration (handshaking) is the purpose 
of the ’C4x’s interlocked operations, handled through interlocked
instructions. For more information about interlocked instructions, see Section 9.7,
Interlocked Operations, on page 9-39.


2.5 Interrupts


The ’C4x supports four external interrupts (IIOF3-0), a number of internal
interrupts, a nonmaskable external NMI interrupt, and a nonmaskable external
RESET signal, which sets the processor to a known state. The DMA and
communication ports have their own internal interrupts. When the CPU responds
to the interrupt, the IACK pin can be used to signal an external interrupt
acknowledge. Section 7.4, Interrupts, on page 7-15, covers RESET and interrupt
processing.
2.6 Peripherals


All ’C4x on-chip peripherals are controlled through memory-mapped registers
on a dedicated peripheral bus. This peripheral bus is composed of a 32-bit data
bus and a 32-bit address bus. This peripheral bus permits straightforward
communication to the peripherals. The ’C4x peripherals include two timers
and six (’C40) or four (’C44) communication ports. Figure 2-8 shows the
peripherals with associated buses and signals.


Figure 2-8. Peripheral Modules 
2.6.1 Communication Ports


Six (’C40) or four (’C44) high-speed communication ports provide rapid
processor-to-processor communication through each port’s dedicated
communication interfaces. Coupled with the ’C4x’s two memory interfaces (global
and local), this allows you to construct a parallel processor system that attains
optimum system performance by distributing tasks among several processors.
Each ’C4x can pass the results of its work to another ’C4x through a
communication port, enabling each ’C4x to continue working. Chapter 12,
Communication Ports, explains communication port operation in detail.


The communication ports offer several features: 


- 160-megabit/s (20 Mbytes or 5 Mwords per second) bidirectional data
  transfer operations (at 40-ns cycle time)

- Simple processor-to-processor communication via eight data lines and
  four control lines

- Buffering of all data transfers, both input and output

- Automatic arbitration to ensure communication synchronization

- Synchronization between the CPU or the direct-memory access (DMA)
  coprocessor and the six communication ports via internal interrupts and
  internal ready signals

- Port direction pin (CDIR) to ease interfacing (’C44 only)
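

The three rates quoted in the first feature are the same throughput in
different units. The short Python calculation below makes the conversion
explicit, assuming 8 bits per byte and 4 bytes per 32-bit word; the function
name is hypothetical.

```python
BITS_PER_BYTE = 8
BYTES_PER_WORD = 4  # one 32-bit word


def port_rates(megabits_per_second):
    """Convert a rate in Mbits/s to (Mbytes/s, Mwords/s)."""
    megabytes = megabits_per_second / BITS_PER_BYTE
    megawords = megabytes / BYTES_PER_WORD
    return megabytes, megawords


print(port_rates(160))
```

Because each port has eight data lines, one byte crosses the interface per
transfer, so a 32-bit word takes four transfers in this model.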


2.6.2 Direct Memory Access (DMA) Coprocessor 


The six channels of the on-chip DMA coprocessor can read from or write to any 
location in the memory map without interfering with the operation of the CPU. 
This allows interfacing to slow external memories and peripherals without re- 
ducing throughput to the CPU. The DMA coprocessor contains its own ad- 
dress generators, source and destination registers, and transfer counter. Ded- 
icated DMA address and data buses allow for minimization of conflicts be- 
tween the CPU and the DMA coprocessor. A DMA operation consists of a 
block or single-word transfer to or from memory. A key feature of the DMA 
coprocessor is its ability to automatically reinitialize each channel following a 
data transfer. See Chapter 11, The DMA Coprocessor, for detailed information 
on the DMA coprocessor. 
2.6.3 Timers


The two timer modules are general-purpose 32-bit timer/event counters with
two signaling modes and internal or external clocking. They can signal
internally to the ’C4x or externally to the outside world at specified intervals, or
they can count external events. Each timer has an I/O pin that can be used as an
input clock to the timer, as an output signal driven by the timer, or as a
general-purpose I/O pin. The timers are described in detail in Chapter 13, The Timers.


Chapter 3 


CPU Registers 


The CPU primary register file contains 32 registers that can be used as oper- 
ands by the multiplier and ALU (arithmetic logic unit). The register file includes 
the auxiliary registers, extended-precision registers, and index registers. 
These registers support addressing, floating-point/integer operations, stack 
management, processor status, block repeats, branching, and interrupts. 


The CPU expansion register file contains two registers — the interrupt vector 
table pointer (IVTP) and the trap vector table pointer (TVTP). 


This chapter describes each of the CPU registers. 


Topic                                  Page
3.1 CPU Primary Register File          3-2
3.2 CPU Expansion Register File        3-17
3.1 CPU Primary Register File


The ’C4x provides 32 registers in a multiport register file that is tightly coupled 
to the CPU. The PC (program counter) is not included in the register file. 
The contents of the register file are listed in Table 3-1. 


Table 3-1. CPU Primary Register File


            Register
Register    Machine
Symbol      Value (hex)  Assigned Function Name                        Subsection  Page
R0          00           Extended-precision register 0                 3.1.1       3-3
R1          01           Extended-precision register 1                 3.1.1       3-3
R2          02           Extended-precision register 2                 3.1.1       3-3
R3          03           Extended-precision register 3                 3.1.1       3-3
R4          04           Extended-precision register 4                 3.1.1       3-3
R5          05           Extended-precision register 5                 3.1.1       3-3
R6          06           Extended-precision register 6                 3.1.1       3-3
R7          07           Extended-precision register 7                 3.1.1       3-3
R8          1C           Extended-precision register 8                 3.1.1       3-3
R9          1D           Extended-precision register 9                 3.1.1       3-3
R10         1E           Extended-precision register 10                3.1.1       3-3
R11         1F           Extended-precision register 11                3.1.1       3-3
AR0         08           Auxiliary register 0                          3.1.2       3-4
AR1         09           Auxiliary register 1                          3.1.2       3-4
AR2         0A           Auxiliary register 2                          3.1.2       3-4
AR3         0B           Auxiliary register 3                          3.1.2       3-4
AR4         0C           Auxiliary register 4                          3.1.2       3-4
AR5         0D           Auxiliary register 5                          3.1.2       3-4
AR6         0E           Auxiliary register 6                          3.1.2       3-4
AR7         0F           Auxiliary register 7                          3.1.2       3-4
DP          10           Data-page pointer                             3.1.3       3-4
IR0         11           Index register 0                              3.1.4       3-4
IR1         12           Index register 1                              3.1.4       3-4
BK          13           Block-size register                           3.1.5       3-5
SP          14           System stack pointer                          3.1.6       3-5
ST          15           Status register                               3.1.7       3-5
DIE         16           DMA coprocessor interrupt enable              3.1.8       3-8
IIE         17           Internal-interrupt enable register            3.1.9       3-11
IIF         18           IIOF flag register (IIOF3-0 pins,             3.1.10      3-13
                         timers, DMA)
RS          19           Repeat start address                          3.1.11      3-16
RE          1A           Repeat end address                            3.1.11      3-16
RC          1B           Repeat counter                                3.1.11      3-16


All of these registers can be used both as operands by the multiplier and ALU, 
and as general-purpose 32-bit registers. However, the registers also perform 
some special functions. For example, the 12 extended-precision registers 
maintain extended-precision floating-point results. The eight auxiliary regis- 
ters support a variety of indirect addressing modes and can be used as gener- 
al-purpose 32-bit integer and logical registers. The remaining registers pro- 
vide system functions such as addressing, stack management, processor sta- 
tus, interrupts, and block repeat. Refer to Chapter 6, Addressing Modes, for 
detailed information and examples of how CPU registers are used in address- 
ing. 


3.1.1 Extended-Precision Registers (R0-R11)


The 12 extended-precision registers (R0-R11) can store and support
operations on 32-bit integer and 40-bit floating-point numbers.


For floating-point numbers, these registers consist of two separate and distinct 
fields: 


- Bits 39-32: store the exponent (e) of a floating-point number.

- Bits 31-0: store the mantissa of a floating-point number:
    - Bit 31: sign bit (s),
    - Bits 30-0: the fraction (f).


Any instruction that assumes that the operands are floating-point numbers 
uses bits 39-0. Figure 3-1 illustrates the storage of 40-bit floating-point num- 
bers in the extended-precision registers. 
Figure 3-1. Extended-Precision Register Floating-Point Format


  39        32  31   30                        0
  |    e     |  s  |            f              |
             |<---------- mantissa ----------->|


For integer operations, bits 31-0 of the extended-precision registers contain 
the integer (signed or unsigned). Any instruction that assumes that the oper- 
ands are either signed or unsigned integers uses only bits 31-0. Bits 39-32 
remain unchanged. This is true for all shift operations. The storage of 32-bit 
integers in the extended-precision registers is shown in Figure 3-2. 


Figure 3-2. Extended-Precision Register Integer Format


  39        32  31                             0
  | unchanged |   signed or unsigned integer   |


3.1.2 Auxiliary Registers (AR0-AR7)


The eight 32-bit auxiliary registers (AR0-AR7) can be accessed by the CPU
and modified by the two auxiliary register arithmetic units (ARAUs). The
primary function of the auxiliary registers is the generation of 32-bit addresses.
However, they can also operate as loop counters in indirect addressing or as 32-bit
general-purpose registers that can be modified by the multiplier and ALU. See
Chapter 6, Addressing Modes, for detailed information and examples of the
use of auxiliary registers in addressing.


3.1.3 Data-Page Pointer (DP)


The data-page pointer (DP) is a 32-bit register whose 16 LSBs are used by the
direct addressing mode as a pointer to the page of data being addressed. Data
pages are 64K words long with a total of 64K (65,536) pages. Bits 31-16 are
reserved; they are always read as zeros and should not be modified by writing
to the register. The DP can be loaded by using the LDP pseudoinstruction or
the LDI instruction. Figure 6-1, on page 6-5, describes this register’s
functions.


3.1.4 Index Registers (IR0, IR1)


The 32-bit index registers (IR0 and IR1) are used by the auxiliary register
arithmetic unit (ARAU) for indexing the address. IR0 is also used for bit-reversed
addressing. See Chapter 6, Addressing Modes, for detailed information and
examples of the use of index registers in addressing. Section 6.4, Indirect
Addressing, on page 6-6, discusses and provides examples of using IRn in
indirect addressing. Section 6.9, Bit-Reversed Addressing, on page 6-32,
describes using IRn with bit-reversed addressing.
3.1.5 Block-Size Register (BK)


The 32-bit block-size register (BK) is used by the ARAU in circular addressing 
to specify the data block size (see Section 6.8, Circular Addressing, on page 
6-27, for more information about the use of the BK register). 


3.1.6 System Stack Pointer (SP) 


The system stack pointer (SP) is a 32-bit register that contains the address of 
the top of the system stack. The SP always points to the last element pushed 
onto the stack. The SP is manipulated by interrupts, traps, calls, returns, and 
the PUSH, PUSHF, POP, and POPF instructions. Pushes and pops of the stack 
perform preincrement and postdecrement, respectively, on all 32 bits of the SP. 


3.1.7 Status Register (ST) 


The status register (ST) contains global information about the CPU’s state. 
Typically, load, store, arithmetic, and logical operations affect the ST’s condi- 
tion flags. When the ST is loaded, the contents of the load instruction’s source 
operand replace the ST’s current contents, regardless of the state of any bit(s) 
in the source operand. Therefore, following an ST load, the contents of the ST 
are identical to the contents of the source operand. This allows the status reg- 
ister to be saved easily and restored. At system reset, 0 is written to the ST; 
after reset, the CF bit is set to 1. The format of the ST is shown in Figure 3-3. 
The text following the figure describes each field in the ST. 


Figure 3-3. Status Register (ST)


[Figure: bit-field layout of the 32-bit status register. Bits 31-20 are
read-only (R), and bits 15-0 are read/write (R/W).]

NOTE: xx = reserved bit. R = read, W = write.


C     Carry-condition flag.

V     Overflow condition flag.

Z     Zero condition flag.

N     Negative condition flag.

UF    Floating-point underflow condition flag.

LV    Latched overflow condition flag.

LUF   Latched floating-point underflow condition flag.

OVM Overflow mode (OVM) flag. This flag affects only integer operations. 

If OVM = 0, the overflow mode is turned off. 

If OVM = 1, integer results overflowing in the positive direction are set to the 
most positive 32-bit twos-complement number (7FFF FFFFh), and integer 
results overflowing in the negative direction are set to the most negative 
32-bit twos-complement number (8000 0000h). 

Note that the functions of bits V and LV are independent of the setting of OVM. 

RM Repeat mode (RM) flag. If RM = 1, the PC is modified in either the 
repeat-block or repeat-single mode. 

PCF Previous state of bit CF. When a trap executes or an interrupt is taken, the 
CF bit is set to 1 and the PCF bit is set to the CF bit’s previous value. 

The RETI and RETID instructions, explained in Chapter 14, Assembly 
Language Instructions, copy PCF to the CF bit. 

CF Cache freeze (CF). Enables or disables updating of the cache. 

Set CF = 1 to freeze the cache. If CF = 1 and CE = 1, fetches from the cache 
and cache clearing (CC = 1) are allowed, but modification of the cache 
contents is not allowed. At reset, this bit is cleared to zero; it is set to 1 after reset. 

When CF = 0, the cache is automatically updated by instruction fetches from 
external memory and cache clearing (CC = 1) is allowed. Traps and interrupts 
set CF. The RETI and RETID instructions copy the PCF bit to the CF bit. 

Table 3-2 summarizes the CE and CF bits. 

CE Cache enable (CE). CE enables or disables the instruction cache. 

Set CE = 1 to enable the cache, allowing the cache to be used according to 
the LRU (least recently used) cache algorithm. 

Set CE = 0 to disable the cache, preventing cache modifications and fetches. 
Cache clearing (CC = 1) is allowed when CE = 0. At reset, 0 is written to CE. 

CC Cache clear. CC = 1 invalidates all entries in the cache (contents not 
guaranteed). This bit is always cleared after it is written to and thus always read as 
0. At reset, 0 is written to this bit. All cache P flags = 0 when cache is cleared. 
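
The OVM rule described above for integer results can be sketched as a host-side model of a 32-bit add. The function name model_addi and its parameters are hypothetical; the sketch only mirrors the saturation values given in the OVM description and the statement that V does not depend on OVM.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit integer add under the OVM flag.
   Returns the 32-bit result pattern; *v_flag reports overflow
   regardless of ovm, since V and LV are independent of OVM. */
static uint32_t model_addi(int32_t a, int32_t b, int ovm, int *v_flag)
{
    int64_t wide = (int64_t)a + (int64_t)b;          /* exact sum */
    int overflow = (wide > INT32_MAX) || (wide < INT32_MIN);

    *v_flag = overflow;
    if (overflow && ovm) {
        /* OVM = 1: saturate toward the direction of the overflow. */
        return (wide > 0) ? 0x7FFFFFFFu : 0x80000000u;
    }
    /* OVM = 0 (or no overflow): keep the low 32 bits of the sum. */
    return (uint32_t)wide;
}
```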
Table 3-2. Summary of the CE and CF Bits 

CE   CF   Effect 
0    0    Cache not enabled 
0    1    Cache not enabled 
1    0    Cache enabled and not frozen 
1    1    Cache enabled but frozen (cache read only) 

GIE Global interrupt enable. Enables or disables all maskable interrupts. 

If GIE = 1, the CPU responds to any enabled interrupts. 

If GIE = 0, the CPU does not respond to any enabled interrupts. This bit does 
not affect NMIs. The IDLE, LAT, RETI, RETID, and TRAP instructions affect this 
bit’s value. GIE is cleared to 0 when a trap is executed or an interrupt is taken. 

PGIE Previous state of bit GIE. When a trap executes or an interrupt is taken, bit 
GIE is cleared to 0. When this occurs, the PGIE bit is set to the GIE bit’s value 
before the trap or interrupt. Note that the RETIcond and RETIcondD 
instructions copy PGIE to the GIE bit. At reset, this bit is cleared to 0. 

SET COND (SC) This bit determines how condition flags (ST bits 0-6) are set. 

If SET COND = 0, condition flags are set if the operation’s target is any 
extended-precision register (R0-R11). This setting makes the ’C4x similar to 
the ’C3x, regarding condition flag settings. This bit is cleared to 0 at reset. 

If SET COND = 1, condition flags are set if the target of the operation is any 
register in the primary register file except the status register. Condition flags 
are always set when a CMPF, CMPI, CMPF3, CMPI3, TSTB, or TSTB3 
instruction is executed, regardless of the value of SET COND. 

ANALYSIS This read-only bit is used in analysis mode to provide state information for 
emulation. 

NMI bus grant (’C44 and ’C40 revision >5.0 only) 

The NMI bus-grant feature is useful in correcting communication-port errors 
when used with the communication-port software reset feature. If bit 19 = 1 
and bit 18 = 0, an internal peripheral bus-grant signal is forced on the falling 
edge of NMI. If NMI is asserted when the peripheral bus is in a stall condition, 
the NMI breaks the pending cycle and then jumps to the NMI service routine. 
A stall condition may occur when writing to a full output FIFO, or when 
reading from an empty input FIFO. 

xx Reserved. Value undefined. These bits are read-only. 
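
The way CF/PCF and GIE/PGIE move together on trap entry and return can be sketched as follows. This is a host-side illustration, not device code: the function names are hypothetical, and the bit positions are an assumption taken from the field order of Figure 3-3 (the text itself confirms only that GIE is ST bit 13).

```c
#include <stdint.h>

/* Assumed ST bit positions (from the field order in Figure 3-3). */
#define ST_PCF  (1u << 9)
#define ST_CF   (1u << 10)
#define ST_GIE  (1u << 13)
#define ST_PGIE (1u << 14)

/* Trap executed or interrupt taken: save CF and GIE in PCF and PGIE,
   then set CF (freeze the cache) and clear GIE. */
static uint32_t st_on_entry(uint32_t st)
{
    uint32_t out = st & ~(ST_PCF | ST_PGIE);
    if (st & ST_CF)  out |= ST_PCF;
    if (st & ST_GIE) out |= ST_PGIE;
    out |= ST_CF;
    out &= ~ST_GIE;
    return out;
}

/* Return from interrupt: copy PCF back to CF and PGIE back to GIE. */
static uint32_t st_on_return(uint32_t st)
{
    uint32_t out = st & ~(ST_CF | ST_GIE);
    if (st & ST_PCF)  out |= ST_CF;
    if (st & ST_PGIE) out |= ST_GIE;
    return out;
}
```

Applying st_on_return to the result of st_on_entry restores the CF and GIE values that were in effect before the trap or interrupt.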
3.1.8 DMA Coprocessor Interrupt Enable Register (DIE) 


3.1.8.1 Unified Mode 


The 32-bit DMA interrupt enable register (DIE) is broken into six subfields that de- 
termine which interrupts can be used to control the synchronization for each of 
the six DMA coprocessor channels. Synchronization controls when a DMA chan- 
nel reads or writes. At reset, zeros are written to all register bits. 


Each DMA channel looks not only at the DMA synchronous interrupts selected 
but also at the synchronization mode that the channel is currently using (see 
Table 11-3). The synchronization mode is specified by the SYNC MODE field 
in the DMA channel control registers located in the DMA coprocessor. 


By using interrupt synchronization, each DMA channel can (for example) service 
a corresponding communication port. Note that DMAi can be synchronized only 
to signals coming from communication port i (where 0 ≤ i ≤ 5). Also, each DMA 
channel can be synchronized to external interrupts and to the on-chip timers. 


Figure 3-4 shows the DMA interrupt enable register for unified mode. 
Table 3-3 summarizes the interrupt activity for each of the four possible 
combinations of DMA0 and DMA1 for unified mode. Table 3-4 summarizes the 
interrupts enabled by three-bit values in DMA2 through DMA5 for unified mode. 


Figure 3-4. DMA Interrupt Enable Register Bit Functions for DMA Unified Mode 

Bits     Field         Access 
31-29    DMA5 Write    R/W 
28-26    DMA5 Read     R/W 
25-23    DMA4 Write    R/W 
22-20    DMA4 Read     R/W 
19-17    DMA3 Write    R/W 
16-14    DMA3 Read     R/W 
13-11    DMA2 Write    R/W 
10-8     DMA2 Read     R/W 
7-6      DMA1 Write    R/W 
5-4      DMA1 Read     R/W 
3-2      DMA0 Write    R/W 
1-0      DMA0 Read     R/W 

R = Read W = Write 
Table 3-3. DMA Channels 0 and 1 (DMA0 and DMA1) Unified Mode Synchronization Interrupts 

Bit Value            Interrupt Enabled at DMA0 or DMA1 
(in DMA0 or DMA1)    DMA0 Read   DMA0 Write   DMA1 Read   DMA1 Write   Interrupt Source for DMA Synchronization 
0 0†                 None        None         None        None         -- 
0 1‡                 ICRDY0      OCRDY0       ICRDY1      OCRDY1       From communication port 
1 0                  IIOF0       IIOF1        IIOF2       IIOF3        From external pins IIOF0-IIOF3 
1 1                  TIM0        TIM0         TIM0        TIM0         From timer TIM0 

† DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used. 
‡ This option is not available for DMA0 and DMA3 in the ’C44. 


Table 3-4. DMA Channels 2 to 5 (DMA2 to DMA5) Unified Mode Synchronization Interrupts 

Bit Value             Interrupt Enabled at DMA2-DMA5† 
(in DMA2 to DMA5)     DMAx Read   DMAx Write   Interrupt Source for DMA Synchronization 
0 0 0‡                None        None         -- 
0 0 1§                ICRDYx†     OCRDYx†      From communication port 
0 1 0                 IIOF0       IIOF0        From external pins IIOF0-IIOF3 
0 1 1                 IIOF1       IIOF1 
1 0 0                 IIOF2       IIOF2 
1 0 1                 IIOF3       IIOF3 
1 1 0                 TIM0        TIM0         From timers TIM0 and TIM1 
1 1 1                 TIM1        TIM1 

† The x in DMAx is the DMA channel number, which is also the number for the corresponding ICRDYx and OCRDYx interrupts. 
For example, a 001₂ in both DMA2 Read and DMA5 Write would enable interrupts ICRDY2 and OCRDY5, respectively. 
All other viable bit values (010₂ to 111₂) are the same (as shown in the table) for DMA2 through DMA5. 

‡ DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used. 

§ This option is not available for DMA0 and DMA3 in the ’C44. 


Note: DMA Coprocessor Uses Signals to Synchronize 


The interrupts in Table 3-3 and Table 3-4 (ICRDYx, OCRDYx, TIM0, etc.) 
are not vectored. The DMA coprocessor uses these as signals to 
synchronize DMA coprocessor transfers. This process is explained in Section 11.10. 
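
As a worked illustration of the DIE layout, the sketch below computes where a channel's read or write subfield sits and writes a synchronization code into it. It assumes the field positions implied by the register figures (2-bit fields for DMA0 and DMA1 in bits 7-0, 3-bit fields for DMA2 to DMA5 in bits 31-8, read below write); the names die_field_shift, die_field_width, and die_set_field are hypothetical.

```c
#include <stdint.h>

enum die_dir { DIE_READ = 0, DIE_WRITE = 1 };

/* Width of one DIE subfield: 2 bits for DMA0/DMA1, 3 bits for DMA2-DMA5. */
static unsigned die_field_width(unsigned ch)
{
    return (ch < 2u) ? 2u : 3u;
}

/* Position of the least significant bit of a channel's subfield. */
static unsigned die_field_shift(unsigned ch, enum die_dir dir)
{
    if (ch < 2u) {
        return 4u * ch + 2u * (unsigned)dir;
    }
    return 8u + 6u * (ch - 2u) + 3u * (unsigned)dir;
}

/* Return a copy of die with one subfield replaced by value. */
static uint32_t die_set_field(uint32_t die, unsigned ch,
                              enum die_dir dir, uint32_t value)
{
    unsigned shift = die_field_shift(ch, dir);
    uint32_t mask = ((1u << die_field_width(ch)) - 1u) << shift;
    return (die & ~mask) | ((value << shift) & mask);
}
```

For example, die_set_field(0, 2, DIE_READ, 1) places 001 in the DMA2 Read field, which Table 3-4 lists as ICRDY2.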
3.1.8.2 Split Mode 


Figure 3-5 shows the DMA interrupt enable register for split mode. Table 3-5 
summarizes the interrupt activity for each of the four possible combinations of 
DMA0 and DMA1 for split mode. Table 3-6 summarizes the interrupts enabled 
by three-bit values in DMA2 through DMA5 for split mode. 


Figure 3-5. DMA Interrupt Enable Register Bit Functions for DMA Split Mode 

Bits     Field                   Access 
31-29    DMA5 Primary Write      R/W 
28-26    DMA5 Auxiliary Read     R/W 
25-23    DMA4 Primary Write      R/W 
22-20    DMA4 Auxiliary Read     R/W 
19-17    DMA3 Primary Write      R/W 
16-14    DMA3 Auxiliary Read     R/W 
13-11    DMA2 Primary Write      R/W 
10-8     DMA2 Auxiliary Read     R/W 
7-6      DMA1 Primary Write      R/W 
5-4      DMA1 Auxiliary Read     R/W 
3-2      DMA0 Primary Write      R/W 
1-0      DMA0 Auxiliary Read     R/W 

R = Read W = Write 

Table 3-5. DMA Channels 0 and 1 (DMA0 and DMA1) Split-Mode Synchronization Interrupts 

Bit Value            Interrupt Enabled at DMA0 or DMA1 
(in DMA0 or DMA1)    DMA0 Auxiliary Read   DMA0 Primary Write   DMA1 Auxiliary Read   DMA1 Primary Write   Interrupt Source for DMA Synchronization 
0 0†                 None                  None                 None                  None                 -- 
0 1‡                 ICRDY0                OCRDY0               ICRDY1                OCRDY1               From communication port 
1 0                  IIOF0                 IIOF1                IIOF2                 IIOF3                From external pins IIOF0-IIOF3 
1 1                  TIM0                  TIM0                 TIM0                  TIM0                 From timer TIM0 

† DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used. 
‡ This option is not available for DMA0 and DMA3 in the ’C44. 
Table 3-6. DMA Channels 2 to 5 (DMA2 to DMA5) Split-Mode Synchronization Interrupts 

Bit Value             Interrupt Enabled at DMA2-DMA5† 
(in DMA2 to DMA5)     DMAx Auxiliary Read   DMAx Primary Write   Interrupt Source for DMA Synchronization 
0 0 0‡                None                  None                 -- 
0 0 1§                ICRDYx†               OCRDYx†              From communication port 
0 1 0                 IIOF0                 IIOF0                From external pins IIOF0-IIOF3 
0 1 1                 IIOF1                 IIOF1 
1 0 0                 IIOF2                 IIOF2 
1 0 1                 IIOF3                 IIOF3 
1 1 0                 TIM0                  TIM0                 From timers TIM0 and TIM1 
1 1 1                 TIM1                  TIM1 

† The x in DMAx is the DMA channel number, which is also the number for the corresponding ICRDYx and OCRDYx interrupts. 
For example, a 001₂ in both DMA2 Read and DMA5 Write would enable interrupts ICRDY2 and OCRDY5, respectively. 
All other viable bit values (010₂ to 111₂) are the same (as shown in the table) for DMA2 through DMA5. 

‡ DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used. 

§ This option is not available for DMA0 and DMA3 in the ’C44. 


3.1.9 CPU Internal Interrupt Enable Register (IIE) 


The 32-bit internal interrupt enable register, shown in Figure 3-6, enables or 
disables the following interrupts for the CPU: 

- Timers 0 and 1 
- For communication ports 0-5: 
  - Input-buffer full 
  - Input-buffer ready 
  - Output-buffer ready 
  - Output-buffer empty 
- DMA coprocessor channels 0-5 

Figure 3-6 shows the IIE register bits. A 1 means the corresponding interrupt 
is enabled; a 0 indicates disabled. At reset, zeros are written to all register bits. 
Figure 3-6. Internal Interrupt Enable Register (IIE) 

Bit(s)   Field(s), most significant bit first                            Unit 
31       ETINT1                                                          Timer 1 
30-25    EDMAINT5, EDMAINT4, EDMAINT3, EDMAINT2, EDMAINT1, EDMAINT0      DMA coprocessor channels 5-0 
24-21    EOCEMPTY5, EOCRDY5, EICRDY5, EICFULL5                           Communication port 5 
20-17    EOCEMPTY4, EOCRDY4, EICRDY4, EICFULL4                           Communication port 4 
16-13    EOCEMPTY3, EOCRDY3, EICRDY3, EICFULL3                           Communication port 3 
12-9     EOCEMPTY2, EOCRDY2, EICRDY2, EICFULL2                           Communication port 2 
8-5      EOCEMPTY1, EOCRDY1, EICRDY1, EICFULL1                           Communication port 1 
4-1      EOCEMPTY0, EOCRDY0, EICRDY0, EICFULL0                           Communication port 0 
0        ETINT0                                                          Timer 0 

All bits are R/W. R = Read, W = Write, R/W = Read/Write 


Notes: 

1) In the figure, the shaded boxes are reserved bits in the ’C44. Zero should 
be written to each of these bits. 

2) Each row lists the fields that correspond to one unit. 

The following are definitions for each of the bits in the IIE. 

EICFULLx Comm. port x input-buffer full interrupt 

EICRDYx Comm. port x input-buffer ready interrupt 

EOCRDYx Comm. port x output-buffer ready interrupt 

EOCEMPTYx Comm. port x output-buffer empty interrupt 

EDMAINTx DMA coprocessor channel x interrupt 

ETINTO Timer 0 interrupt 

ETINT1 Timer 1 interrupt 


In each field label, the x represents a communication port number (0-5) or 
a DMA coprocessor channel number (0-5). For example, a 1 in bit 5 causes 
interrupts to be generated when communication port number 1’s input buffer 
becomes full. Similarly, a 1 in bit 26 enables channel 1 of the DMA coprocessor to 
respond to interrupts. A 1 enables each interrupt; a 0 disables it. 
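
The two examples above (bit 5 for EICFULL1, bit 26 for EDMAINT1) follow from a regular layout, which can be sketched as below. The helper names are hypothetical; the arithmetic assumes the layout of Figure 3-6, with ETINT0 in bit 0, four bits per communication port starting at bit 1, and the DMA channel enables starting at bit 25.

```c
#include <stdint.h>

/* Order of the four per-port enables, lowest bit first. */
enum iie_port_event {
    IIE_ICFULL  = 0,  /* EICFULLx:  input-buffer full    */
    IIE_ICRDY   = 1,  /* EICRDYx:   input-buffer ready   */
    IIE_OCRDY   = 2,  /* EOCRDYx:   output-buffer ready  */
    IIE_OCEMPTY = 3   /* EOCEMPTYx: output-buffer empty  */
};

#define IIE_ETINT0_BIT 0u
#define IIE_ETINT1_BIT 31u

/* Bit number of a communication-port enable (port 0-5). */
static unsigned iie_port_bit(unsigned port, enum iie_port_event ev)
{
    return 1u + 4u * port + (unsigned)ev;
}

/* Bit number of a DMA channel enable (channel 0-5). */
static unsigned iie_dma_bit(unsigned ch)
{
    return 25u + ch;
}

/* Return a copy of iie with the given bit set (interrupt enabled). */
static uint32_t iie_enable(uint32_t iie, unsigned bit)
{
    return iie | (1u << bit);
}
```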
3.1.10 IIOF Flag Register (IIF) 

The IIF register controls the external interrupt pins IIOF(3-0). Use it to specify: 

- Which IIOF pins are used for general-purpose I/O and which are used for 
  interrupts 
- Whether a general-purpose pin is input (read only) or output (read/write) 
- Whether an interrupt pin is for edge-triggered or level-triggered interrupts 
- Whether an external interrupt is enabled or disabled 

The IIF register also contains timer, DMA, and NMI interrupt flags. Figure 3-7 
shows the IIF register’s bits. The text following the figure explains these bits 
in detail. 


The IIF register bits can be read from or written to under software control. This 
provides access to the IIOFx pins, which can be treated as general-purpose 
I/O or as interrupt pins. For example, if at the IIF register, FUNCx = 0 (I/O pin) 
and TYPEx = 1 (output pin), then by writing into the FLAGx bit, you can also 
write to the external pin IIOFx. If FUNCx = 1 (interrupt pin), writing a 1 to the 
IIF register FLAGx bit has the same effect as an incoming interrupt received 
on the corresponding pin. Consequently, all interrupts can be triggered and/or 
cleared through software. Since the interrupt bits also can be read from, the 
interrupt pins can be polled in software when an interrupt-driven interface is 
not required. 


Internal interrupts operate in a similar manner. In the IIF register, the bit 
corresponding to an internal interrupt (e.g., TINT0, TINT1) can be read from and 
written to through software. Writing a 1 sets the interrupt latch, and writing a 
0 clears it. All internal interrupts are one H1/H3 cycle in length. Modify the IIF 
by using logic operations (AND, OR, etc.) as shown: 


correct                 incorrect 

LDI  @MASK, R0          LDI  IIF, R1 
AND  R0, IIF            AND  @MASK, R1 
                        LDI  R1, IIF 

Traps and interrupts are described briefly in Section 3.2, CPU Expansion Reg- 
ister File, on page 3-17, and in detail in Section 7.4, Interrupts, on page 7-15, 
and Section 7.5, Traps, on page 7-24. 
Figure 3-7. Interrupt Flag Register (IIF) 

Bits 31-24   Access R/W   DMAINTx and TINTx interrupt flags 
Bits 23-16   Access R     Reserved bits and the NMI flag 

Bit      15       14      13      12      11       10      9       8 
Field    EIIOF3   FLAG3   TYPE3   FUNC3   EIIOF2   FLAG2   TYPE2   FUNC2 
Access   R/W      R/W     R/W     R/W     R/W      R/W     R/W     R/W 

Bit      7        6       5       4       3        2       1       0 
Field    EIIOF1   FLAG1   TYPE1   FUNC1   EIIOF0   FLAG0   TYPE0   FUNC0 
Access   R/W      R/W     R/W     R/W     R/W      R/W     R/W     R/W 

R = Read, W = Write, R/W = Read/Write 


FUNCx Mode of pin IIOFx. If FUNCx = 0, pin IIOFx is a general-purpose I/O (R/W) 
pin. If FUNCx = 1, pin IIOFx is an interrupt pin. 


TYPEx Type of function for pin IIOFx. 
If pin IIOFx is a general-purpose I/O pin (FUNCx = 0): 
TYPEx = 0 makes IIOFx an input pin. 
TYPEx = 1 makes IIOFx an output pin. 
If pin IIOFx is an interrupt pin (FUNCx = 1): 
TYPEx = 0 makes IIOFx an edge-triggered latched interrupt. 
TYPEx = 1 makes IIOFx a level-triggered unlatched interrupt. 


FLAGx Flag for pin IIOFx. 

If pin IIOFx is a general-purpose input pin (FUNCx = 0, TYPEx = 0), 
FLAGx = the value of pin IIOFx and is read only. 

If pin IIOFx is a general-purpose output pin (FUNCx = 0, TYPEx = 1), 
FLAGx = the value on pin IIOFx and is R/W. 

If pin IIOFx is an interrupt pin (FUNCx = 1): 
FLAGx = 0 if interrupt is not asserted. 
FLAGx = 1 if interrupt is asserted. 

If 0 (zero) is written to FLAGx, the corresponding interrupt is cleared unless 
an interrupt is on the same pin; in that case, the interrupt will remain set. 


EIIOFx Disable/enable external interrupt. 
EIIOFx = 0 disables external interrupts at pin IIOFx. 
EIIOFx = 1 enables external interrupts at pin IIOFx. 


NMI Nonmaskable interrupt flag (NMI). The NMI interrupt (on the external NMI 
pin) behaves like other interrupts, except that it cannot be masked (disabled) 
by the GIE bit (ST bit 13) or by writing to the NMI bit. It is temporarily masked 
during delayed branches and multicycle CPU operations. At reset, this bit is 
cleared. An asserted interrupt is cleared only by servicing the interrupt. NMI 
is a negative-going, edge-triggered, latched interrupt. It is read-only. 

Reading NMI as 0 indicates that the interrupt is not asserted. 
Reading NMI as 1 indicates that the interrupt is asserted. 


Reserved Reserved; read as zeros. 


TINT0, TINT1 Timer interrupt flags 0 and 1. 

Reading TINTx as 0 indicates that the timer interrupt is not asserted. 
Reading TINTx as 1 indicates that the timer interrupt is asserted. 

A zero written to this bit clears the interrupt unless the interrupt is asserted 
at the same time; in that case, the interrupt will be shown as asserted. 


DMAINTx Interrupt flag for DMA coprocessor channels 0 to 5. Reading DMAINTx as 
0 indicates that the channel interrupt is not asserted. Reading DMAINTx as 
1 indicates that the channel interrupt is asserted. A zero written to this bit 
clears the interrupt unless the interrupt is asserted at the same time; in that 
case, the interrupt is shown as asserted. 


Notes: 


1) Shaded IIF bits 0, 1, 2, 3 apply to pin IIOF0; shaded IIF bits 4, 5, 6, 7 apply 
to IIOF1, etc. 


2) The x represents the corresponding IIOF interrupt pin (IIOF0-3). 
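
The FUNCx/TYPEx combinations described above can be decoded with a short host-side sketch. The names iiof_mode, iiof_decode, and iiof_int_enabled are hypothetical; the code assumes the four bits for pin IIOFx occupy IIF bits 4x to 4x+3 in the order FUNCx, TYPEx, FLAGx, EIIOFx, as in Figure 3-7.

```c
#include <stdint.h>

typedef enum {
    IIOF_GP_INPUT,   /* FUNCx = 0, TYPEx = 0 */
    IIOF_GP_OUTPUT,  /* FUNCx = 0, TYPEx = 1 */
    IIOF_INT_EDGE,   /* FUNCx = 1, TYPEx = 0: edge-triggered, latched    */
    IIOF_INT_LEVEL   /* FUNCx = 1, TYPEx = 1: level-triggered, unlatched */
} iiof_mode;

/* Decode the mode of pin IIOFx (pin = 0-3) from an IIF value. */
static iiof_mode iiof_decode(uint32_t iif, unsigned pin)
{
    uint32_t nibble = (iif >> (4u * pin)) & 0xFu;
    int func = (int)(nibble & 1u);
    int type = (int)((nibble >> 1) & 1u);

    if (func == 0) {
        return type ? IIOF_GP_OUTPUT : IIOF_GP_INPUT;
    }
    return type ? IIOF_INT_LEVEL : IIOF_INT_EDGE;
}

/* Nonzero if EIIOFx is set, i.e. external interrupts are enabled at the pin. */
static int iiof_int_enabled(uint32_t iif, unsigned pin)
{
    return (int)((iif >> (4u * pin + 3u)) & 1u);
}
```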
3.1.11 Block-Repeat (RS, RE) and Repeat-Count (RC) Registers 


The 32-bit repeat start address register (RS) contains the starting address of 
the block of program memory to be repeated when the CPU is operating in the 
repeat mode. 


The 32-bit repeat end address register (RE) contains the ending address of 
the block of program memory to be repeated when the CPU is operating in the 
repeat mode. 


Note: 


If RE < RS, the block of program memory is not repeated, and the code does 
not loop backwards. However, the ST(RM) bit remains set to 1. 


The repeat-count register (RC) is a 32-bit register that specifies the number 
of times a block of code is to be repeated when a block repeat is performed. 
If RC contains the number n, the loop is executed n + 1 times. 
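
The n + 1 relationship is easy to get wrong by one, so a small host-side sketch may help. The function names are hypothetical, and the loop is only a counting model of the rule stated above, not a description of how the hardware updates RC.

```c
#include <stdint.h>

/* Count how many times a repeated block runs for a given RC value:
   the block always runs once, then once more per remaining count. */
static uint64_t block_repeat_passes(uint32_t rc)
{
    uint64_t passes = 0;
    uint64_t remaining = rc;

    for (;;) {
        passes++;                 /* one pass through the block */
        if (remaining == 0) {
            break;
        }
        remaining--;
    }
    return passes;
}

/* RC value to load so that the block runs `passes` times (passes >= 1). */
static uint32_t rc_for_passes(uint32_t passes)
{
    return passes - 1u;
}
```

So a block that must run 10 times needs RC = 9, and RC = 0 still runs the block once.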


3.1.12 Program Counter (PC) 


The program counter (PC) is a 32-bit register containing the address of the 
next instruction to fetch. While the program counter is not part of the CPU reg- 
ister file, it can be modified by the same instructions that modify the program 
flow. 


3.1.13 Reserved Bits and Compatibility 


To retain compatibility with future members of the ’C4x family of microproces- 
sors, reserved bits that are read as zero must be written as zero. Reserved bits 
that have an undefined value must not have their current value modified. In 
other cases, maintain the reserved bits as specified. 
3.2 CPU Expansion Register File 


This expansion register file contains two special control registers: 

- Interrupt-vector table pointer (IVTP) 
- Trap-vector table pointer (TVTP) 


Table 3-7. CPU Expansion Registers 

Register 
Assembler Syntax   Machine Value (Hex)   Function Name 
IVTP               00                    Interrupt-vector table pointer. Points to start of the interrupt-vector table. 
TVTP               01                    Trap-vector table pointer. Points to start of the trap-vector table. 


Use the LDEP instruction to load (copy) an expansion register to a primary 
register (e.g., to any of the auxiliary registers AR0-AR7; see Table 3-1 on 
page 3-2). For example: 


LDEP IVTP,AR5 ; IVTP contents to AR5 


Likewise, use the LDPE instruction to load (copy) a primary register to an ex- 
pansion register. Neither of these instructions affects the status register condi- 
tion flags. 


LDPE AR5,IVTP ; AR5 contents to IVTP 


Note that both the interrupt-vector table and the trap-vector table are required 
to lie on a 512-word boundary; thus, the nine least significant bits of these 
pointers are zeros (i.e., 10 0000 0000₂ = 512 = 200h). Write only zeros to 
these bits (though the register forces these to zeros). 
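
The alignment rule amounts to masking off the nine least significant bits. The sketch below shows that arithmetic on the host; the names vtp_effective and vtp_is_aligned are hypothetical and only model the statement that the low nine bits are forced to zero.

```c
#include <stdint.h>

#define VTP_ALIGN_WORDS 512u   /* vector tables lie on a 512-word boundary */

/* Value a vector-table pointer holds after its nine least
   significant bits are forced to zero. */
static uint32_t vtp_effective(uint32_t value)
{
    return value & ~(VTP_ALIGN_WORDS - 1u);
}

/* Nonzero if addr is a valid (512-word aligned) table base. */
static int vtp_is_aligned(uint32_t addr)
{
    return (addr & (VTP_ALIGN_WORDS - 1u)) == 0u;
}
```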


The 32-bit IVTP register points to (is essentially the base address for) the in- 
terrupt-vector table (IVT) in memory. 


The 32-bit TVTP register is essentially the base address for the trap-vector 
table (TVT) in memory. This table contains the vectors for the TRAP 
instruction’s 512 trap addresses (TRAP0-TRAP511). 


The interrupt and trap vector tables can share the same 512-word space in 
memory. In this configuration, you can place trap vectors where there are no 
interrupt vectors. For example, since interrupt vector 02Ch is unused, you 
could place a trap vector at IVTP + 02Ch (which is also TVTP + 02Ch if the 
tables overlap) and then call that trap by specifying 02Ch in the TRAP 
instruction. 
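
The shared-table example can be checked with a little address arithmetic. This is a host-side sketch with hypothetical function names; it assumes, as the example above implies, that the vector for trap n is the word at TVTP + n and that an interrupt vector sits at IVTP plus its offset.

```c
#include <stdint.h>

/* Address of the vector for TRAP n (n = 0-511). */
static uint32_t trap_vector_address(uint32_t tvtp, unsigned trap_number)
{
    return tvtp + (trap_number & 0x1FFu);
}

/* Address of the interrupt vector at a given word offset in the IVT. */
static uint32_t interrupt_vector_address(uint32_t ivtp, unsigned offset)
{
    return ivtp + offset;
}
```

With both pointers loaded with the same base, trap 02Ch and interrupt-vector offset 02Ch resolve to the same word, which is why only unused interrupt-vector slots should hold trap vectors in a shared table.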


At reset, IVTP and TVTP are both set to zero. 
Memory and the Instruction Cache 


The ’C40 accesses a total memory space of 4G 32-bit words (16G bytes) of 
program, data, and I/O space; the ’C44 accesses a total memory space of 32M 
32-bit words (128M bytes). 


Two internal RAM blocks of 1K x 32 bits each (4K bytes) and an internal ROM 
block containing a bootloader permit two accesses per block in a single cycle. 


A 128 x 32-bit instruction cache allows code to be stored off-chip in slower, 
lower-cost memories without degrading performance. The cache also speeds 
data fetches to the same physical space as the program because it does not 
burden the bus with program instruction fetches. 


This chapter describes the memory maps and the instruction cache. 




4.1 Memory Map 




The ’C4x memory space of 4 gigawords (4G x 32 bits, where 1G = 2^30) is shown 
in the memory maps in Figure 4-1 and Figure 4-2. The contents of the first 
segment of address space, at 0000 0000h to 000F FFFFh, are selected by the 
value of the ROM enable (ROMEN) pin: 

- ROMEN = 1. Addresses 0000 0000h-0000 0FFFh are an on-chip ROM 
  block (reserved for bootloader operations), and addresses 
  0000 1000h-000F FFFFh are reserved. 
- ROMEN = 0. The on-chip (reserved) ROM is disabled, and addresses 
  0000 0000h-000F FFFFh are mapped to the local bus. 

Memory starting at 0010 0000h is not affected by ROMEN. The following is a 
general summary of address ranges: 

- 0000 0000h-000F FFFFh: Can be local bus or on-chip (reserved) ROM, 
  depending on the value of ROMEN. If ROMEN = 0, these addresses are 
  mapped to the local bus. If ROMEN = 1, these addresses are mapped to 
  the on-chip ROM. 
- 0010 0000h-0010 00FFh: Internal peripherals (DMA coprocessor, 
  communication ports, timers, etc.). Instructions cannot be loaded from 
  this area. 
- 0010 0100h-002F F7FFh: Reserved. Instructions cannot be loaded from 
  this area. 
- 002F F800h-002F FBFFh: 1K RAM Block 0. 
- 002F FC00h-002F FFFFh: 1K RAM Block 1. 
- 0030 0000h-7FFF FFFFh: Local bus. These addresses are mapped to 
  the local bus. 
- 8000 0000h-0FFFF FFFFh: Global bus. These addresses are mapped 
  to the global bus. 


CPU data accesses and DMA accesses can be made from any unreserved 
part of the ’C4x memory map. Instruction fetches can take place from any 
unreserved area of the ’C4x memory map, except from the peripheral space 
(addresses 0010 0000h-0010 00FFh). 
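
The address ranges summarized above can be expressed as a small classifier. This is a host-side sketch: the type and function names are hypothetical, and the boundaries are the ones listed in the summary for the ’C40 (’C44 alias regions are not modeled).

```c
#include <stdint.h>

typedef enum {
    REGION_BOOT_ROM,     /* on-chip ROM, ROMEN = 1 only */
    REGION_RESERVED,
    REGION_LOCAL_BUS,
    REGION_PERIPHERALS,
    REGION_RAM_BLOCK0,
    REGION_RAM_BLOCK1,
    REGION_GLOBAL_BUS
} c4x_region;

/* Classify a word address; romen is the state of the ROMEN pin. */
static c4x_region c4x_classify(uint32_t addr, int romen)
{
    if (addr <= 0x000FFFFFu) {
        if (!romen) {
            return REGION_LOCAL_BUS;
        }
        return (addr <= 0x00000FFFu) ? REGION_BOOT_ROM : REGION_RESERVED;
    }
    if (addr <= 0x001000FFu) return REGION_PERIPHERALS;
    if (addr <= 0x002FF7FFu) return REGION_RESERVED;
    if (addr <= 0x002FFBFFu) return REGION_RAM_BLOCK0;
    if (addr <= 0x002FFFFFu) return REGION_RAM_BLOCK1;
    if (addr <= 0x7FFFFFFFu) return REGION_LOCAL_BUS;
    return REGION_GLOBAL_BUS;
}

/* Instruction fetches are allowed from any unreserved area
   except the peripheral space. */
static int c4x_can_fetch(uint32_t addr, int romen)
{
    c4x_region r = c4x_classify(addr, romen);
    return r != REGION_RESERVED && r != REGION_PERIPHERALS;
}
```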


Note: 


The ’C4x internal ROM is generally reserved for TI internal use only. However, 
for high-volume applications, you can request that TI install your code in 
the internal ROM. 


Figure 4-1. ’C40 Memory Map 

                          (a) Internal ROM disabled              (b) Internal ROM enabled 
                          (ROMEN = 0)                            (ROMEN = 1) 
Address range             Microprocessor Mode                    Microcomputer Mode 

Structure depends upon ROMEN bit (first 1M): 
0000 0000h-0000 0FFFh     Accessible local bus (external)        Bootloader ROM (internal) 
0000 1000h-000F FFFFh     Accessible local bus (external)        Reserved 

Structure identical: 
0010 0000h-0010 00FFh     Peripherals (internal) (see Figure 2-6) 
0010 0100h-001F FFFFh     Reserved 
0020 0000h-002F F7FFh     Reserved 
002F F800h-002F FBFFh     1K RAM BLK 0 (internal) 
002F FC00h-002F FFFFh     1K RAM BLK 1 (internal) 
0030 0000h-7FFF FFFFh     Local bus (external), 2G-3M 
8000 0000h-0FFFF FFFFh    Global bus (external), 2G 
Figure 4-2. ’C44 Memory Map 

                          (a) Internal ROM disabled              (b) Internal ROM enabled 
                          (ROMEN = 0)                            (ROMEN = 1) 
Address range             Microprocessor Mode                    Microcomputer Mode 

Structure depends upon ROMEN bit (first 1M): 
0000 0000h-0000 0FFFh     Accessible local bus (external)        Bootloader ROM (internal) 
0000 1000h-000F FFFFh     Accessible local bus (external)        Reserved 

Structure identical: 
0010 0000h-0010 00FFh     Peripherals (internal) (see Figure 2-6) 
0010 0100h-001F FFFFh     Reserved 
0020 0000h-002F F7FFh     Reserved 
002F F800h-002F FBFFh     1K RAM BLK 0 (internal) 
002F FC00h-002F FFFFh     1K RAM BLK 1 (internal) 
0030 0000h-7FFF FFFFh     Local bus (external), 13M, followed by the local bus alias region, 2G-16M 
8000 0000h-0FFFF FFFFh    Global bus (external), 16M, followed by the global bus alias region 
4.2 Peripheral Bus Memory Map 


The peripheral bus memory map resides in addresses 0010 0000h- 
0010 00FFh. Each peripheral requires a 16-word area. Figure 4-3 shows the 
locations of registers for each peripheral in the memory map. 


Figure 4-3. Peripheral Memory Map 

Address range             Peripheral (16 words each) 
0010 0000h-0010 000Fh     Local and Global Port Control (see subsection 4.2.1 and Figure 4-4) 
0010 0010h-0010 001Fh     Analysis Module Block Registers (see subsection 4.2.2) 
0010 0020h-0010 002Fh     Timer 0 Registers (see subsection 4.2.3 and Figure 4-5) 
0010 0030h-0010 003Fh     Timer 1 Registers (see subsection 4.2.3 and Figure 4-5) 
0010 0040h-0010 004Fh     Communication Port 0 (’C40 only) (see subsection 4.2.4 and Figure 4-6) 
0010 0050h-0010 005Fh     Communication Port 1 (see subsection 4.2.4 and Figure 4-6) 
0010 0060h-0010 006Fh     Communication Port 2 (see subsection 4.2.4 and Figure 4-6) 
0010 0070h-0010 007Fh     Communication Port 3 (’C40 only) (see subsection 4.2.4 and Figure 4-6) 
0010 0080h-0010 008Fh     Communication Port 4 (see subsection 4.2.4 and Figure 4-6) 
0010 0090h-0010 009Fh     Communication Port 5 (see subsection 4.2.4 and Figure 4-6) 
0010 00A0h-0010 00AFh     DMA Coprocessor Channel 0 (see subsection 4.2.5 and Figure 4-7) 
0010 00B0h-0010 00BFh     DMA Coprocessor Channel 1 (see subsection 4.2.5 and Figure 4-7) 
0010 00C0h-0010 00CFh     DMA Coprocessor Channel 2 (see subsection 4.2.5 and Figure 4-7) 
0010 00D0h-0010 00DFh     DMA Coprocessor Channel 3 (see subsection 4.2.5 and Figure 4-7) 
0010 00E0h-0010 00EFh     DMA Coprocessor Channel 4 (see subsection 4.2.5 and Figure 4-7) 
0010 00F0h-0010 00FFh     DMA Coprocessor Channel 5 (see subsection 4.2.5 and Figure 4-7) 
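
Because every peripheral occupies one 16-word block, block base addresses follow from simple arithmetic. The sketch below is a host-side illustration with hypothetical names; the block order is the one shown in the peripheral memory map.

```c
#include <stdint.h>

#define PERIPH_BASE        0x00100000u  /* start of the peripheral bus map */
#define PERIPH_BLOCK_WORDS 16u          /* each peripheral gets 16 words   */

/* Base address of the n-th 16-word block (n = 0-15). */
static uint32_t periph_block_base(unsigned index)
{
    return PERIPH_BASE + PERIPH_BLOCK_WORDS * index;
}

/* Blocks 2-3 are the timers, 4-9 the communication ports,
   and 10-15 the DMA coprocessor channels. */
static uint32_t timer_block_base(unsigned timer)     { return periph_block_base(2u + timer); }
static uint32_t comm_port_block_base(unsigned port)  { return periph_block_base(4u + port); }
static uint32_t dma_block_base(unsigned channel)     { return periph_block_base(10u + channel); }
```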
4.2.1 Local and Global Memory Interface Control Registers 


These registers control the local and global memory interfaces. They occupy 
the first 16-word block of the peripheral bus memory map, shown in 
Figure 4-3. The registers themselves are shown in Figure 4-4. Chapter 9, 
External Bus Operation, covers the operation of these registers. 


These registers define several settings: 

- The page sizes used for the two strobes of each port 
- Address ranges over which the strobes are active 
- Wait states 
- Other similar operations that compose the memory interfaces 


Figure 4-4. Memory Interface Control Registers 

0010 0000h                Global Memory Interface Control Register 
0010 0001h-0010 0003h     Reserved 
0010 0004h                Local Memory Interface Control Register 
0010 0005h-0010 000Fh     Reserved 


4.2.2 Analysis Module Registers 


The second lowest 16-word block in the peripheral bus memory map, as 
shown in Figure 4-3, contains part of the analysis module registers. These 
registers are reserved for emulation functions. The TMS320C4x C Source 
Debugger User’s Guide (literature number SPRU054) describes the analysis 
module user interface provided by the ’C4x debugger. 
4.2.3 Timer Registers 


This group of registers occupies the 0010 0020h-0010 003Fh range in the 
peripheral bus memory map shown in Figure 4-3, on page 4-5. Timers and 
their registers are covered in detail in Chapter 13, Timers. 


Figure 4-5. Timer Registers 

Timer 0 
0010 0020h                Timer 0 control register 
0010 0021h-0010 0023h     Reserved 
0010 0024h                Timer 0 counter register 
0010 0025h-0010 0027h     Reserved 
0010 0028h                Timer 0 period register 
0010 0029h-0010 002Fh     Reserved 

Timer 1 
0010 0030h                Timer 1 control register 
0010 0031h-0010 0033h     Reserved 
0010 0034h                Timer 1 counter register 
0010 0035h-0010 0037h     Reserved 
0010 0038h                Timer 1 period register 
0010 0039h-0010 003Fh     Reserved 


4.2.4 Communication Port Memory Map 


Figure 4-6 illustrates the communication-port control registers (CPCR) and 
input and output FIFO buffers. This is the central group of registers in the 
peripheral bus memory map shown in Figure 4-3, on page 4-5. These registers 
are described in more detail in Chapter 12, Communication Ports. 


Figure 4-6. Communication Port Memory Map 

0010 0040h     CPCR 0 (’C40 only) 
0010 0041h     Input port 0, FIFO position 0 
0010 0042h     Output port 0, FIFO position 7 
0010 0043h     Port 0 software reset 

0010 0050h     CPCR 1 
0010 0051h     Input port 1, FIFO position 0 
0010 0052h     Output port 1, FIFO position 7 
0010 0053h     Port 1 software reset 

0010 0060h     CPCR 2 
0010 0061h     Input port 2, FIFO position 0 
0010 0062h     Output port 2, FIFO position 7 
0010 0063h     Port 2 software reset 

0010 0070h     CPCR 3 (’C40 only) 
0010 0071h     Input port 3, FIFO position 0 
0010 0072h     Output port 3, FIFO position 7 
0010 0073h     Port 3 software reset 

0010 0080h     CPCR 4 
0010 0081h     Input port 4, FIFO position 0 
0010 0082h     Output port 4, FIFO position 7 
0010 0083h     Port 4 software reset 

0010 0090h     CPCR 5 
0010 0091h     Input port 5, FIFO position 0 
0010 0092h     Output port 5, FIFO position 7 
0010 0093h     Port 5 software reset 

0010 009Fh     End of the communication port 5 block 
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4.2.5 DMA Coprocessor Registers 


Peripheral Bus Memory Map 


The DMA registers (shown in Figure 4—7) are the bottom block of registers in 
the peripheral bus memory map (Figure 4-3 on page 4-5). These registers 
are described in Chapter 11, The DMA Coprocessor. 


Figure 4—7. DMA Coprocessor Memory Map 


0010 00A0h-0010 00A8h    DMA channel 0 registers (see exploded view)
0010 00A9h-0010 00AFh    Reserved
0010 00B0h-0010 00B8h    DMA channel 1 registers (see exploded view)
0010 00B9h-0010 00BFh    Reserved
0010 00C0h-0010 00C8h    DMA channel 2 registers (see exploded view)
0010 00C9h-0010 00CFh    Reserved
0010 00D0h-0010 00D8h    DMA channel 3 registers (see exploded view)
0010 00D9h-0010 00DFh    Reserved
0010 00E0h-0010 00E8h    DMA channel 4 registers (see exploded view)
0010 00E9h-0010 00EFh    Reserved
0010 00F0h-0010 00F8h    DMA channel 5 registers (see exploded view)
0010 00F9h-0010 00FFh    Reserved


Exploded View of Each Channel Register

0010 00z0h   Control register x
0010 00z1h   Source address x
0010 00z2h   Source address index x
0010 00z3h   Transfer counter x
0010 00z4h   Destination address x
0010 00z5h   Destination address index x
0010 00z6h   Link pointer x
0010 00z7h   Auxiliary transfer counter x
0010 00z8h   Auxiliary link pointer x

x = channel number (e.g., x = 1 for channel 1, x = 2 for channel 2, etc.)

z = corresponding hexadecimal digit for channel address (e.g., substitute
    an A for DMA channel 0, B for DMA channel 1, etc.)
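
The z-digit substitution in the exploded view amounts to simple address
arithmetic. The following Python sketch is an illustration only; the names
DMA_BASE, CHANNEL_STRIDE, DMA_REGISTERS, and dma_register_address are
hypothetical helpers, not part of any TI tool. It assumes the channel blocks
sit 10h apart starting at 0010 00A0h, as the memory map indicates.

```python
# Base of the DMA coprocessor block in the peripheral bus memory map.
DMA_BASE = 0x001000A0
CHANNEL_STRIDE = 0x10   # channel 0 at ...A0h, channel 1 at ...B0h, and so on

# Register order taken from the exploded view (offsets 0 through 8).
DMA_REGISTERS = (
    "control", "source_address", "source_address_index", "transfer_counter",
    "destination_address", "destination_address_index", "link_pointer",
    "aux_transfer_counter", "aux_link_pointer",
)

def dma_register_address(channel, register):
    """Return the peripheral-bus address of one DMA channel register."""
    if not 0 <= channel <= 5:
        raise ValueError("DMA channels are numbered 0 to 5")
    return DMA_BASE + channel * CHANNEL_STRIDE + DMA_REGISTERS.index(register)

print(f"{dma_register_address(1, 'transfer_counter'):08X}")  # 001000B3
```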
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4.3 Instruction Cache 


The 128 x 32-bit instruction cache speeds instruction fetches and lowers
system cost. It allows the use of slow external memories while still
achieving single-cycle access performance. The cache also frees the external
buses from program fetches, allowing these buses to be used for DMA or other
system needs. The cache can operate completely automatically, without
external intervention. It uses a form of the least recently used (LRU) cache
update algorithm.


4.3.1 Instruction Cache Architecture


The instruction cache (see Figure 4-9 on page 4-11) contains 128 32-bit
words of RAM, enough to hold 128 words of program memory. It is divided into
four 32-word segments. Associated with each segment is a 27-bit segment
start address (SSA) register. For each word in the cache, there is a
corresponding single-bit present (P) flag.


When the CPU requests an instruction word, a check is made to determine 
whether the word is already in the instruction cache. The partitioning of an in- 
struction address as used by the cache control algorithm is shown in 
Figure 4—8. The 27 most significant bits (MSBs) of the instruction address se- 
lect the segment, and the five least significant bits (LSBs) define the address 
of the instruction word within the pertinent segment. The 27 MSBs of the in- 
struction address are compared with the four SSA registers. If a match is 
found, the relevant P flag is checked. The P flag indicates whether the word 
within a particular segment is already present in cache memory: 


- P = 1: the word is already present in cache memory.
- P = 0: the location in cache is invalid (e.g., contains garbage).


Figure 4—8. Address Partitioning for Cache Control Algorithm 
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31 5 4 0 
[Figure: bits 31-5 hold the segment start address (SSA); bits 4-0 hold the
instruction word address within the segment.]
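
The address partitioning can be illustrated with a few lines of Python. This
is a sketch only; split_instruction_address is a hypothetical helper name,
and the sample address is arbitrary.

```python
WORD_BITS = 5                       # 5 LSBs select a word in a 32-word segment
WORD_MASK = (1 << WORD_BITS) - 1    # 0x1F

def split_instruction_address(address):
    """Return (segment start address, word index) for a 32-bit address."""
    address &= 0xFFFFFFFF
    return address >> WORD_BITS, address & WORD_MASK

ssa, word = split_instruction_address(0x00300047)
print(hex(ssa), word)   # 0x18002 7
```

The first value is what would be compared against the four SSA registers; the
second selects the P flag and the word within the matching segment.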


If there is no match, one of the segments must be replaced by the new data. 
The segment replaced in this circumstance is determined by the LRU (least 
recently used) algorithm. The LRU stack (see the upper-right portion of 
Figure 4—9) is maintained for this purpose. 


Figure 4-9. Instruction Cache Architecture

[Figure: four cache segments (Segment 0 through Segment 3). Each segment has
a 27-bit segment start address register (SSA Register 0 through 3), 32
single-bit P flags (0 through 31), and 32 segment words (Segment word 0
through Segment word 31) of 32 bits each. An LRU stack holds the segment
numbers, ordered from the most recently used segment number at the top to
the least recently used segment number at the bottom.]



The LRU stack keeps track of which segment (0-3) qualifies as the least re- 
cently used after each access to the cache. Each time a segment is accessed, 
its segment number is removed from the LRU stack and pushed onto the top 
of the LRU stack. Therefore, the number at the top of the stack is the most re- 
cently used segment number, and the number at the bottom of the stack is the 
least recently used segment number. 


At reset, the following occur in the instruction cache: 


- Cache is disabled (ST(CE) = 0). After reset, the cache is frozen
  (ST(CF) = 1). See section 3.1.7, Status Register (ST), on page 3-5, for
  details.

- All P flags are set to zero.

- The LRU stack is initialized with segment 0 at the top, followed by
  segments 1, 2, and 3 at the bottom. If any two SSA registers are equal (due
  to reset conditions) and a cache hit occurs, the instruction word is
  fetched from the most recently used segment.


When a replacement is necessary, the least recently used segment is selected
for replacement. Also, the 32 P flags for the segment to be replaced are set
to 0, and the segment’s SSA register is replaced with the 27 MSBs of the new
instruction’s address.


4.3.2 Cache Control Bits 
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Four cache control bits are located in the CPU status register (ST): the cache 
clear bit (CC), the cache enable bit (CE), the cache freeze bit (CF), and the 
previous cache freeze bit (PCF). The status register is shown in Figure 3-3. 


Cache Clear Bit (CC). Set CC = 1 to invalidate all entries in the cache. This 
bit is always cleared after it is written to; thus, it is always read as 0. At reset, 
0 is written to this bit. The cache P flag = 0 when the cache is cleared. 


Cache Enable Bit (CE). Set CE = 1 to enable the cache, allowing the cache 
to be used according to the LRU (least recently used) cache algorithm. Set 
CE = 0 to disable the cache; this prevents cache updates or modifications 
(thus, no cache fetches can be made). At reset, 0 is written to this bit. Cache 
clearing (CC = 1) is allowed when CE = 0. 


Cache Freeze Bit (CF). Set CF = 1 to freeze the cache including freezing of 
LRU (least recently used) stack manipulation. If the cache is enabled (CE = 
1) and the cache is frozen (CF = 1), fetches from the cache are allowed, but 
modification of the cache contents is not allowed. Cache clearing (CC = 1) is 
allowed when CF = 1. At reset, this bit is cleared to 0 and after reset it is set 
to 1. When CF = 0, cache clearing (CC=1) is allowed. CF is set to one when 
a trap or interrupt is taken. Also, the RETI and RETID instructions copy PCF 
to the CF bit. 



Table 4-1 summarizes the effects of the CE and CF bits. 


Table 4—1. Combined Effect of the CE and CF Bits 


CE   CF   Effect
0    0    Cache not enabled
0    1    Cache not enabled
1    0    Cache enabled and not frozen
1    1    Cache enabled and frozen


Previous Cache Freeze Bit (PCF). When an interrupt or trap vector is taken,
the CF value is copied to the PCF bit, and the CF bit is set to 1. This
protects the cache during interrupt processing and is particularly useful
when code loops are interrupted. The interrupt service routine may optionally
use the cache under software control. Interrupts may also be nested, provided
that the status register is saved before the interrupts are enabled. When the
instructions RETIcond and RETIcondD are executed to complete interrupt
processing, the contents of the PCF bit are copied to the CF bit.
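
The interaction of CE, CF, and PCF can be modeled in a few lines of Python.
This is a sketch under stated assumptions: the bit positions below (PCF = bit
9, CF = bit 10, CE = bit 11, CC = bit 12) are assumed and should be checked
against Figure 3-3; the only placement this section confirms is that CC and
CE together form 1800h. The function names are hypothetical.

```python
# Assumed ST bit positions (verify against the status register figure).
PCF = 1 << 9
CF = 1 << 10
CE = 1 << 11
CC = 1 << 12

def enter_interrupt(st):
    """Interrupt or trap taken: CF is copied to PCF, then CF is set to 1."""
    st = (st | PCF) if st & CF else (st & ~PCF)
    return st | CF

def return_from_interrupt(st):
    """RETIcond / RETIcondD: PCF is copied back to CF."""
    return (st | CF) if st & PCF else (st & ~CF)

def cache_mode(st):
    """Combined effect of the CE and CF bits (Table 4-1)."""
    if not st & CE:
        return "cache not enabled"
    return "cache enabled and frozen" if st & CF else "cache enabled and not frozen"

st = CE                                            # enabled, not frozen
in_isr = enter_interrupt(st)
print(cache_mode(in_isr))                          # cache enabled and frozen
print(cache_mode(return_from_interrupt(in_isr)))   # cache enabled and not frozen
```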


4.3.3 Using the Cache 


Only instructions may be fetched from the program cache. All reads and writes
of data to and from memory bypass the cache. Program fetches from internal
memory do not modify the cache and do not generate cache hits or misses.
The program cache is a single-access memory block. Dummy program fetches
(i.e., following a branch) can generate cache misses and cache updates.
Example 4-1 shows a typical way to clear and enable the cache.


Example 4-1. Enabling the Cache

        OR      1800h, ST       ;Clear and enable the cache

To use the cache more efficiently, take two precautions:

- Avoid using self-modifying code. If an instruction resides in the cache and
  the corresponding location in primary memory is modified, the copy of the
  instruction in the cache is not modified.

- Align program code. Use the .align directive when coding assembly language
  to align code on 32-word address boundaries.
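
The benefit of 32-word alignment can be seen by counting how many cache
segments a block of code overlaps. The Python sketch below is illustrative
only; segments_touched is a hypothetical helper and the addresses are
arbitrary examples.

```python
SEGMENT_WORDS = 32

def segments_touched(start_address, length_words):
    """Number of distinct 32-word cache segments a code block overlaps."""
    first = start_address // SEGMENT_WORDS
    last = (start_address + length_words - 1) // SEGMENT_WORDS
    return last - first + 1

print(segments_touched(0x1000, 32))   # 1 (aligned 32-word loop)
print(segments_touched(0x1010, 32))   # 2 (same loop, not aligned)
```

Because the cache holds only four segments, a loop that straddles a segment
boundary uses a larger share of the cache than the same loop aligned on a
32-word boundary.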
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Instruction Cache 


4.3.4 The LRU Cache Algorithm 
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When the ’C4x requests an instruction word from external memory, the two 
possible actions are a cache hit or a cache miss: 


- Cache Hit. The cache contains the requested instruction, and the following
  actions occur:

  - The instruction word is read from the cache.

  - The number of the segment containing the word is removed from the LRU
    stack and pushed to the top of the LRU stack (if it is not already at
    the top), thus moving the other segment numbers toward the bottom of
    the stack.

- Cache Miss. The cache does not contain the instruction. There are two
  types of cache misses:

  - Subsegment miss. The segment address register matches the instruction
    address, but the relevant P flag is not set. The following actions
    occur:

    - The instruction word is read from memory and copied into the cache.

    - The number of the segment containing the word is removed from the LRU
      stack and pushed to the top of the LRU stack (if it is not already at
      the top), thus moving the other segment numbers toward the bottom of
      the stack.

    - The relevant P flag is set.

  - Segment miss. None of the segment addresses matches the instruction
    address. The following actions occur:

    - The least recently used segment is selected for replacement and the
      P flags for all 32 words are cleared.

    - The SSA register for the selected segment is loaded with the 27 MSBs
      of the address of the requested instruction word.

    - The instruction word is fetched and copied into the cache. It goes
      into the appropriate word of the least recently used segment. The
      P flag for that word is set to 1.

    - The number of the segment containing the instruction word is removed
      from the LRU stack and pushed to the top of the LRU stack, thus moving
      the other segment numbers toward the bottom of the stack.
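
The hit, subsegment-miss, and segment-miss actions listed above can be
sketched as a small Python model. This is a simplified illustration, not a
description of the hardware: the class name InstructionCache and its methods
are hypothetical, SSA registers start as None instead of holding arbitrary
reset values, and the CE, CF, and CC bits are not modeled.

```python
class InstructionCache:
    """Toy model: four 32-word segments with LRU segment replacement."""

    def __init__(self):
        self.ssa = [None] * 4                            # segment start addresses
        self.present = [[False] * 32 for _ in range(4)]  # P flags
        self.words = [[0] * 32 for _ in range(4)]
        self.lru = [0, 1, 2, 3]       # index 0 = top (most recently used)

    def _touch(self, seg):
        """Move a segment number to the top of the LRU stack."""
        self.lru.remove(seg)
        self.lru.insert(0, seg)

    def fetch(self, address, memory):
        tag, idx = address >> 5, address & 0x1F
        if tag in self.ssa:
            seg = self.ssa.index(tag)
            outcome = "hit" if self.present[seg][idx] else "subsegment miss"
        else:
            outcome = "segment miss"
            seg = self.lru[-1]                 # least recently used segment
            self.present[seg] = [False] * 32   # clear all 32 P flags
            self.ssa[seg] = tag                # load the 27 MSBs
        if outcome != "hit":
            self.words[seg][idx] = memory(address)
            self.present[seg][idx] = True
        self._touch(seg)
        return outcome, self.words[seg][idx]

cache = InstructionCache()
memory = lambda addr: addr ^ 0xABCD      # stand-in for external program memory
print(cache.fetch(0x100, memory)[0])     # segment miss
print(cache.fetch(0x101, memory)[0])     # subsegment miss
print(cache.fetch(0x100, memory)[0])     # hit
print(cache.lru)                         # [3, 0, 1, 2]
```

With the reset ordering of the LRU stack (segment 0 at the top), the first
segment miss replaces segment 3, which is why segment 3 ends up at the top of
the stack in this run.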


Chapter 5 


Data Formats and Floating-Point Operation 


In the ’C4x architecture, data is organized into three fundamental types: inte- 
ger, unsigned-integer, and floating-point. Note that the terms, integer and 
signed-integer, are considered to be equivalent. The ’C4x supports short and 
single-precision formats for signed and unsigned integers. It also supports 
short, single-precision and extended-precision formats for floating-point data. 


Floating-point operations make computations fast, trouble-free, accurate, and
precise. Specifically, the ’C4x implementation of floating-point arithmetic
facilitates floating-point operations at integer speeds while preventing
problems with overflow, operand alignment, and other burdensome tasks common
in integer operations.


This chapter discusses in detail the data formats and floating-point operations 
supported on the ’C4x. 
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Signed-Integer Formats 


5.1 Signed-Integer Formats 


The ’C4x supports two signed-integer formats: a 16-bit short format and a
32-bit single-precision format. The term integer is used throughout this
chapter to refer to a signed integer.


Note:

When extended-precision registers are used as integer operands, only bits
31-0 are used; bits 39-32 remain unchanged and unused.


5.1.1 Short Integer Format


The 16-bit twos-complement short integer format is used for immediate integer
operands. For those instructions that assume integer operands, this format is
sign extended to 32 bits (see Figure 5-1). The range of an integer si,
represented in the short integer format, is:

$-2^{15} \le si \le 2^{15} - 1$

In Figure 5-1 and other figures in this chapter, s = sign bit.


Figure 5-1. Short-Integer Format and Sign Extension of Short Integer

(a) Short integer format: bits 15-0 hold the short integer.

(b) Sign extension of a short integer format: bits 31-16 each hold a copy
    of the sign bit s; bits 15-0 hold the short integer.
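
Sign extension of a short integer can be sketched in Python as follows. The
helper names sign_extend_16 and to_signed_32 are hypothetical and serve only
to illustrate the bit manipulation.

```python
def sign_extend_16(value):
    """Sign-extend a 16-bit short integer to a 32-bit word (bit pattern)."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if value & 0x8000 else value

def to_signed_32(word):
    """Interpret a 32-bit pattern as a twos-complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word

print(hex(sign_extend_16(0x8000)), to_signed_32(sign_extend_16(0x8000)))
# 0xffff8000 -32768
```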


5.1.2 Single-Precision Integer Format 


In the single-precision integer format, the integer is represented in
twos-complement notation. The range of an integer sp, represented in the
single-precision integer format, is $-2^{31} \le sp \le 2^{31} - 1$.
Figure 5-2 shows the single-precision integer format.

Figure 5-2. Single-Precision Integer Format

[Figure: a 32-bit word, bits 31-0, with the sign bit s in bit 31.]
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Unsigned-Integer Formats 


5.2 Unsigned-Integer Formats 


Two unsigned-integer formats are supported on the ’C4x: a 16-bit short format 
and a 32-bit single-precision format. In this chapter, the term unsigned integer 
is used to refer to an unsigned integer. 


Note:

When extended-precision registers are used, the unsigned-integer operands
use only bits 31-0; bits 39-32 remain unchanged.


5.2.1 Short Unsigned-Integer Format


Figure 5-3 shows the 16-bit short unsigned-integer format used in immediate
unsigned-integer operands. For instructions that use unsigned-integer
operands, the format is filled with zeros to 32 bits. The range of a short
unsigned integer is $0 \le si < 2^{16}$.

Figure 5-3. Short Unsigned-Integer Format and Zero Fill

(a) Short unsigned-integer format: bits 15-0 hold the short unsigned
    integer.

(b) Zero fill of a short unsigned-integer format: bits 31-16 are zeros;
    bits 15-0 hold the short unsigned integer.


5.2.2 Single-Precision Unsigned-Integer Format 


In the single-precision unsigned-integer format, the number is represented as
a 32-bit value, as shown in Figure 5-4. The range of a single-precision
unsigned integer is $0 \le sp < 2^{32}$.


Figure 5-4. Single-Precision Unsigned-Integer Format

[Figure: a 32-bit word, bits 31-0, holding the unsigned integer.]
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Floating-Point Formats 


5.3 Floating-Point Formats 


The ’C4x supports three floating-point formats: 


- A short floating-point format (for immediate floating-point operands)
  consisting of a 4-bit exponent, one sign bit, and an 11-bit fraction

- A single-precision format consisting of an 8-bit exponent, one sign bit,
  and a 23-bit fraction

- An extended-precision format consisting of an 8-bit exponent, one sign bit,
  and a 31-bit fraction


All ’C4x floating-point formats consist of three fields: an exponent field
(e), a single-bit sign field (s), and a fraction field (f). The sign field
and fraction field may be considered as one unit and referred to as the
mantissa field (man). Each format is divided into these fields as shown in
Figure 5-5.


Figure 5-5. General Floating-Point Format

[Figure: the exponent field (e) occupies the most significant bits, followed
by the sign bit (s) and the fraction field (f); s and f together form the
mantissa.]


The general equation for calculating the value of a floating-point number is
given by Equation 5-1. In the equation, s is the value of the sign bit,
$\bar{s}$ is the inverse of the value of the sign bit, f is the binary value
of the fraction field, and e is the decimal equivalent of the exponent field.


Equation 5-1. Value in a Floating Point Number

$x = s\bar{s}.f_2 \times 2^{e}$


The mantissa represents a normalized twos-complement number. In a nor- 
malized representation, a most significant nonsign bit is implied, thus provid- 
ing an additional bit of precision. The implied sign bit is used as follows: 


- If s = 0, then the leading two bits of the mantissa are 01.
- If s = 1, then the leading two bits of the mantissa are 10.


If the sign bit, s, is equal to 0, the mantissa becomes $01.f_2$, where f is
the binary representation of the fraction field. If s is 1, the mantissa
becomes $10.f_2$.


For example, if $f = 00000000001_2$ and s = 0, the value of the mantissa
(man) would be $01.00000000001_2$. If s = 1 for the same value of f, the
value of man would be $10.00000000001_2$.



The exponent field is a twos-complement number that determines the factor 
of two by which the number is multiplied. Essentially, the exponent field shifts 
the binary point in the mantissa. If the exponent is positive, then the binary 
point is shifted to the right. If the exponent is negative, then the binary point 
is shifted to the left. 


For example, if $man = 01.00000000001_2$ and $e = 11_{10}$, then the binary
point is shifted eleven places to the right, producing the number
$0100000000001_2$, which is equal to 2049 decimal.
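
The relationship between the fields and the value can be checked with exact
rational arithmetic. The Python sketch below is an illustration only;
c4x_value is a hypothetical helper, fraction_bits is the width of the
fraction field (11, 23, or 31), and the reserved encoding for zero is not
handled here.

```python
from fractions import Fraction

def c4x_value(e, s, f, fraction_bits):
    """Value of a float from its exponent, sign bit, and fraction field."""
    # The mantissa is the twos-complement number 01.f (s = 0) or 10.f (s = 1),
    # that is, 1 + 0.f or -2 + 0.f.
    mantissa = Fraction(f, 1 << fraction_bits) + (1 if s == 0 else -2)
    return mantissa * Fraction(2) ** e

print(c4x_value(11, 0, 0b00000000001, 11))   # 2049
```

The call reproduces the example above: a mantissa of 01.00000000001 with an
exponent of 11 evaluates to 2049.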


5.3.1 Short Floating-Point Format


In the short floating-point format, floating-point numbers are represented by
a twos-complement 4-bit exponent field (e) and a twos-complement 12-bit
mantissa field (man) with an implied most significant nonsign bit.


Figure 5-6. Short Floating-Point Format

[Figure: bits 15-12 hold the exponent, bit 11 holds the sign, and bits 10-0
hold the fraction; the sign and fraction together form the mantissa.]


You must use the following reserved values to represent zero in the short
floating-point format:


e = -8
s = 0
f = 0


Operations are performed with an implied binary point between bits 11 and 10.
The floating-point twos-complement number x in the short floating-point
format is given by:


$x = 01.f_2 \times 2^{e}$    if s = 0
$x = 10.f_2 \times 2^{e}$    if s = 1
$x = 0$                      if e = -8, s = 0, f = 0


The following examples illustrate the range and precision of the short
floating-point format:


Most Positive:   $x = (2 - 2^{-11}) \times 2^{7} = 2.5594 \times 10^{2}$
Least Positive:  $x = 1 \times 2^{-7} = 7.8125 \times 10^{-3}$
Least Negative:  $x = (-1 - 2^{-11}) \times 2^{-7} = -7.8163 \times 10^{-3}$
Most Negative:   $x = -2 \times 2^{7} = -2.5600 \times 10^{2}$
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Floating-Point Formats 


5.3.2 Single-Precision Floating-Point Format 


In the single-precision format, the floating-point number is represented by an 
8-bit exponent field (e) and a twos-complement 24-bit mantissa field (man) 
with an implied most significant nonsign bit. 


Operations are performed with an implied binary point between bits 23 and 22. 
When the implied most significant nonsign bit is made explicit, it is located to 
the immediate left of the binary point. The floating-point number x is given by 

$x = 01.f_2 \times 2^{e}$    if s = 0
$x = 10.f_2 \times 2^{e}$    if s = 1
$x = 0$                      if e = -128, s = 0, f = 0


Figure 5—7. Single-Precision Floating-Point Format 
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31 24 23 22 0 
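[Figure: bits 31-24 hold the exponent, bit 23 holds the sign, and bits 22-0
hold the fraction; the sign and fraction together form the mantissa.]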


You must use the following reserved values to represent zero in the single-pre- 
cision floating-point format: 


e=-128 

s=0 

f=0 

The following examples illustrate the range and precision of the single-preci- 
sion floating-point format. 

Most Positive:   $x = (2 - 2^{-23}) \times 2^{127} = 3.4028234 \times 10^{38}$
Least Positive:  $x = 1 \times 2^{-127} = 5.8774717 \times 10^{-39}$
Least Negative:  $x = (-1 - 2^{-23}) \times 2^{-127} = -5.8774724 \times 10^{-39}$
Most Negative:   $x = -2 \times 2^{127} = -3.4028236 \times 10^{38}$



5.3.3 Extended-Precision Floating-Point Format 


In the extended-precision format, the floating-point number is represented by 
an 8-bit exponent field (e) and a 32-bit mantissa field (man) with an implied 
most significant nonsign bit. 


Operations are performed with an implied binary point between bits 31 and 30. 
When the implied most significant nonsign bit is made explicit, it is located to 
the immediate left of the binary point. The floating-point number x is given by: 


$x = 01.f_2 \times 2^{e}$    if s = 0
$x = 10.f_2 \times 2^{e}$    if s = 1
$x = 0$                      if e = -128, s = 0, f = 0


Figure 5—8. Extended-Precision Floating-Point Format 


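[Figure: bits 39-32 hold the exponent, bit 31 holds the sign, and bits 30-0
hold the fraction; the sign and fraction together form the mantissa.]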


You must use the following reserved values to represent zero in the exten- 
ded-precision floating-point format: 


e=-128 

s=0 

f=0 

The following examples illustrate the range and precision of the extended-pre- 
cision floating-point format: 

Most Positive:   $x = (2 - 2^{-31}) \times 2^{127} = 3.4028236683 \times 10^{38}$
Least Positive:  $x = 1 \times 2^{-127} = 5.8774717541 \times 10^{-39}$
Least Negative:  $x = (-1 - 2^{-31}) \times 2^{-127} = -5.8774717569 \times 10^{-39}$
Most Negative:   $x = -2 \times 2^{127} = -3.4028236691 \times 10^{38}$
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5.3.4 Determining the Decimal Equivalent of a Floating-Point Number 


There are two basic steps in determining the value stored in floating point for- 
mat: 


1) Determine the values of the exponent and mantissa. 


2) Shift the binary point in the mantissa according to the value of the expo- 
nent field and then convert the number to decimal. 


5.3.4.1 Step 1: Determine the Values of the Exponent and Mantissa
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The exponent field is a twos-complement number whose range depends on 
the type of floating-point number you are converting. Record the decimal 
equivalent of this value as e. 


For example, if you are converting a single-precision floating-point number
and the binary value of the exponent field is $00000100_2$, then the decimal
value of the exponent would be 4, since a 1 in the third bit from the right
corresponds to 4.


If, on the other hand, the binary value of the exponent field is
$11111100_2$, then the decimal value of the exponent would be -4. Since the
first bit on the left is 1, you know that the number is negative. You
calculate the magnitude of the number by taking the ones complement of
$11111100_2$, which is $00000011_2$, and then adding 1 to that result, which
gives $00000100_2 = 4$.


Note:

If the value of the exponent matches the value reserved for zero, then the
floating-point number is equal to zero. The reserved value for each
floating-point type is given with the type descriptions in Section 5.3.


The mantissa is a binary number with an implied binary point between the sign 
bit and the fraction field. Form the mantissa in one of two ways: 


- If s = 0, form the mantissa by writing 01. and appending the bits in the
  fraction field after the binary point.

  For example, if $f = 10100000000_2$, then $man = 01.10100000000_2$.

- If s = 1, form the mantissa by writing 10. and appending the bits in the
  fraction field after the binary point.

  For example, if $f = 10100000000_2$, then $man = 10.10100000000_2$.


5.3.4.2 Step 2: Shift the Decimal Point in the Mantissa and Convert to Decimal 


If the exponent (e) has a positive value, then you shift the binary point e places 
to the right. 


If the exponent (e) has a negative value, then you shift the binary point
|e| places to the left.


For example, if $e = 2_{10}$ and $man = 01.11000000000_2$, then the shifted
mantissa becomes $0111.000000000_2$, which is equivalent to 7 in decimal.


If, on the other hand, $e = -2_{10}$ and $man = 01.10000000000_2$, then the
shifted mantissa becomes $.0110000000000_2$, which is equivalent to 3/8 in
decimal.


The following examples illustrate how you can obtain the equivalent floating- 
point value of a number in ’C4x floating-point format. Each of the examples 
uses the single-precision floating point format. 


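Example 5-1. Positive Number

Hex value:      0    2    4    0    0    0    0    0
Binary value:   0000 0010 0100 0000 0000 0000 0000 0000

Exponent = $0000\,0010_2 = 2$
Sign     = 0
Fraction = $.1000\ldots_2$
Value    = $01.1_2 \times 2^{2} = 0110.0_2 = 6$

In the mantissa 01.1, the leading 0 is the sign bit, the 1 before the binary
point is the implied bit, and the 1 after the binary point is the fraction.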
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Example 5—2. Negative Number 


Hex value:      0    1    C    0    0    0    0    0
Binary value:   0000 0001 1100 0000 0000 0000 0000 0000

Exponent = $0000\,0001_2 = 1$
Sign     = 1
Fraction = $.1000\ldots_2$
Value    = $10.1_2 \times 2^{1} = 101.0_2 = -3$

In the mantissa 10.1, the leading 1 is the sign bit, the 0 before the binary
point is the implied bit, and the 1 after the binary point is the fraction.

Example 5-3. Fractional Number

Hex value:      F    B    4    0    0    0    0    0
Binary value:   1111 1011 0100 0000 0000 0000 0000 0000

Exponent = $1111\,1011_2 = -5$
Sign     = 0
Fraction = $.1000\ldots_2$
Value    = $01.1_2 \times 2^{-5} = .000011_2 = 3/64$
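
The two-step procedure can be applied mechanically to the 32-bit words used in
Examples 5-1 through 5-3. The Python sketch below is illustrative only;
decode_c4x_single is a hypothetical helper, and treating every word with
e = -128 as zero is a simplifying assumption (the text reserves e = -128,
s = 0, f = 0 for zero).

```python
from fractions import Fraction

def decode_c4x_single(word):
    """Decode a 32-bit single-precision word into an exact Fraction."""
    e = (word >> 24) & 0xFF
    if e >= 0x80:
        e -= 0x100                  # exponent field is twos complement
    if e == -128:
        return Fraction(0)          # reserved exponent value for zero
    s = (word >> 23) & 1
    f = word & 0x7FFFFF
    # Step 1: form the mantissa 01.f (s = 0) or 10.f (s = 1).
    mantissa = Fraction(f, 1 << 23) + (-2 if s else 1)
    # Step 2: shift the binary point by e places.
    return mantissa * Fraction(2) ** e

for word in (0x02400000, 0x01C00000, 0xFB400000):
    print(f"{word:08X} -> {decode_c4x_single(word)}")
# 02400000 -> 6
# 01C00000 -> -3
# FB400000 -> 3/64
```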
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5.3.5 Conversion Between Floating-Point Formats 


Floating-point operations assume several different formats for inputs and out- 
puts. These formats often require conversion from one floating-point format to 
another (for example, from short floating-point format to extended-precision 
floating-point format). Format conversions occur automatically in hardware, 
with no overhead, as a part of floating-point operations. Examples of the four 
conversions are shown in Figure 5—9 through Figure 5—12 (s = sign bit of the 
exponent). When a floating-point format zero is converted to a different format, 
it is always converted to a valid representation of zero in that format. 


Figure 5—9. Short Floating-Point Format Conversion to Single-Precision Floating-Point 
Format 


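(a) Short floating-point format: bits 15-12 hold the exponent, bit 11 the
    sign, and bits 10-0 the fraction.

(b) Single-precision floating-point format: bits 31-28 hold copies of the
    exponent sign bit (s), bits 27-24 the exponent, bit 23 the sign,
    bits 22-12 the fraction, and bits 11-0 zeros.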


In converting from short format to single-precision format, the exponent field is 
sign extended and the rightmost 12 bits of the fraction field are filled with zeros. 


Figure 5—10. Short Floating-Point Format Conversion to Extended-Precision 
Floating-Point Format 


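(a) Short floating-point format: bits 15-12 hold the exponent, bit 11 the
    sign, and bits 10-0 the fraction.

(b) Extended-precision floating-point format: bits 39-36 hold copies of the
    exponent sign bit (s), bits 35-32 the exponent, bit 31 the sign,
    bits 30-20 the fraction, and bits 19-0 zeros.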


In converting from short format to extended-precision format, the exponent 
field is sign extended and the rightmost 20 bits of the fraction field are filled with 
zeros. 


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Figure 5-11. Single-Precision Floating-Point Format Conversion to Extended-Precision 
Floating-Point Format 


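(a) Single-precision floating-point format: bits 31-24 hold the exponent,
    bit 23 the sign, and bits 22-0 the fraction.

(b) Extended-precision floating-point format: bits 39-32 hold the exponent,
    bit 31 the sign, bits 30-8 the fraction, and bits 7-0 zeros.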


In converting from single-precision format to extended-precision format, the 
rightmost eight bits of the fraction field are filled with zeros. 


Figure 5-12. Extended-Precision Floating-Point Format Conversion to Single-Precision 
Floating-Point Format 


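(a) Extended-precision floating-point format: bits 39-32 hold the exponent,
    bit 31 the sign, and bits 30-0 the fraction.

(b) Single-precision floating-point format: bits 31-24 hold the exponent,
    bit 23 the sign, and bits 22-0 the fraction.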


In converting from extended-precision format to single-precision format, the 
eight rightmost bits of the fraction field are truncated. 
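
The conversions in Figures 5-9 through 5-12 are pure bit manipulations. The
Python sketch below illustrates three of them on raw bit patterns; the
function names are hypothetical, and the special case for zero covers only
the reserved short-format encoding (e = -8, s = 0, f = 0, that is 8000h),
which is an assumption about how a format zero is recognized.

```python
def short_to_single(short):
    """16-bit short float -> 32-bit single-precision float."""
    short &= 0xFFFF
    if short == 0x8000:                 # reserved short-format zero
        return 0x80000000               # single-precision zero (e = -128)
    exponent = (short >> 12) & 0xF
    if exponent & 0x8:
        exponent |= 0xF0                # sign-extend the exponent to 8 bits
    mantissa = short & 0xFFF            # sign bit plus 11-bit fraction
    return (exponent << 24) | (mantissa << 12)   # low 12 bits zero filled

def single_to_extended(single):
    """32-bit single -> 40-bit extended: eight zero bits are appended."""
    return (single & 0xFFFFFFFF) << 8

def extended_to_single(extended):
    """40-bit extended -> 32-bit single: eight rightmost bits truncated."""
    return (extended >> 8) & 0xFFFFFFFF

print(f"{short_to_single(0x2400):08X}")          # 02400000
print(f"{single_to_extended(0x02400000):010X}")  # 0240000000
```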



5.4 Floating-Point Conversion (IEEE Std. 754) 


The ’C4x floating-point format is not compatible with the IEEE standard 754 
format. However, the ’C4x has instructions to directly convert to and from IEEE 
format (TOIEEE and FRIEEE, respectively). The conversion process is ex- 
plained in subsections 5.4.1 and 5.4.2. Figure 5—13 shows the IEEE floating- 
point format, and Figure 5—14 shows the floating-point ’C4x format. 


Figure 5—13. IEEE Single-Precision Std. 754 Floating-Point Format 


[Figure: bit 31 holds the sign (s), bits 30-23 hold the exponent (e), and
bits 22-0 hold the fraction (f).]


The following five cases define the value v of a number expressed in the IEEE
format:

1) If $e = 255$ and $f \neq 0$, then $v = \mathrm{NaN}$
2) If $e = 255$ and $f = 0$, then $v = (-1)^{s} \infty$
3) If $0 < e < 255$, then $v = (-1)^{s} \times 2^{e-127}(1.f)$
4) If $e = 0$ and $f \neq 0$, then $v = (-1)^{s} \times 2^{-126}(0.f)$
5) If $e = 0$ and $f = 0$, then $v = (-1)^{s} \times 0$ (zero)


where s = sign bit; e = the exponent field; f = the fraction field;
NaN = not a number


For the above five representations, e is treated as an unsigned integer. Case
1 generates NaN (not a number) and is primarily used for software signaling.
Case 4 represents a denormalized number. Case 5 represents positive and
negative zero.
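
The five cases translate directly into code. The Python sketch below is an
illustration; ieee_single_value is a hypothetical helper that takes the raw
32-bit pattern and returns a Python float.

```python
import math

def ieee_single_value(word):
    """Value of an IEEE Std. 754 single-precision bit pattern."""
    s = (word >> 31) & 1
    e = (word >> 23) & 0xFF           # treated as an unsigned integer
    f = word & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if e == 255:                      # cases 1 and 2
        return math.nan if f else sign * math.inf
    if e > 0:                         # case 3: normalized, mantissa 1.f
        return sign * 2.0 ** (e - 127) * (1 + f / 2**23)
    return sign * 2.0 ** -126 * (f / 2**23)   # cases 4 and 5: mantissa 0.f

print(ieee_single_value(0xC0400000))   # -3.0
```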


Figure 5-14. ’C4x Single-Precision Twos-Complement Floating-Point Format†

[Figure: bits 31-24 hold the exponent (e), bit 23 holds the sign (s), and
bits 22-0 hold the fraction (f).]

† Same format as for the ’C3x


In comparison, Figure 5-14 shows the ’C4x twos-complement floating-point
format. In this format, two cases can be used to define the value v of a
number:
1) lf =-128 and f+0, then v=0 
2) If $e \neq -128$, then $v = s\bar{s}.f_2 \times 2^{e}$


where s = sign bit; e = the exponent field; f = the fraction field.
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Floating-Point Conversion (IEEE Std. 754) 


For this representation, e is treated as a twos-complement integer. The frac- 
tion and sign bit form a normalized twos-complement mantissa. 


Note: Differentiating Symbols for IEEE and ’C4x Formats

To differentiate between the symbols that define these two formats, all IEEE
fields are subscripted with IEEE (e.g., e_IEEE, s_IEEE, etc.). Similarly, all
twos-complement fields are subscripted with two (i.e., e_two, s_two, f_two).


5.4.1 Converting IEEE Format to Twos-Complement ’C4x Floating-Point Format 


The most common conversion is the IEEE-to-twos-complement format. This 
conversion is done according to rules in the following table: 


Table 5-1. Converting IEEE Format to Twos-Complement Floating-Point Format


        If These Values Are Present          Then These Values Equal
Case    e_IEEE             s_IEEE  f_IEEE    e_two          s_two  f_two
1       255                0       any       7Fh            0      7F FFFFh
2       255                1       any       7Fh            1      00 0000h
3       0 < e_IEEE < 255   0       any       e_IEEE - 7Fh   0      f_IEEE
4       0 < e_IEEE < 255   1       ≠ 0       e_IEEE - 7Fh   1      ~f_IEEE + 1 †
5       0 < e_IEEE < 255   1       0         e_IEEE - 80h   1      0
6       0                  any     any       80h            0      00 0000h


† ~f_IEEE = ones complement of f_IEEE.



Case 1 maps the IEEE positive NaNs and positive infinity to the single-preci- 
sion twos-complement most positive number. Overflow is also signaled to al- 
low you to check for these special cases. 


Case 2 maps the IEEE negative NaNs and negative infinity to the single- 
precision twos-complement most negative number. Overflow is also signaled 
to allow you to check for these special cases. 


Case 3 maps the IEEE positive normalized numbers to the identical value in 
the twos-complement positive number. 


Case 4 maps the IEEE negative normalized numbers with a nonzero fraction 
to the identical value in the twos-complement negative number. 


Case 5 maps the IEEE negative normalized numbers with a zero fraction to 
the identical value in the twos-complement negative number. 


Case 6 maps the IEEE positive and negative denormalized numbers and posi- 
tive and negative zeros to a twos-complement zero. 
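
The mapping rules for the IEEE-to-twos-complement direction can be sketched in
Python on raw 32-bit patterns. This is an illustration of the rules, not a
model of the FRIEEE instruction: ieee_to_c4x and float_to_ieee_bits are
hypothetical helper names, the result is packed as a 32-bit single-precision
word, and the overflow signaling mentioned above is not modeled.

```python
import struct

def float_to_ieee_bits(x):
    """Raw IEEE single-precision bit pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def ieee_to_c4x(word):
    """Map an IEEE single-precision pattern to the twos-complement format."""
    s = (word >> 31) & 1
    e = (word >> 23) & 0xFF
    f = word & 0x7FFFFF
    if e == 255:                       # NaN or infinity: saturate
        return 0x7F800000 if s else 0x7F7FFFFF
    if e == 0:                         # zero or denormalized: zero
        return 0x80000000
    if s == 0:                         # positive normalized
        e_two, f_two = e - 0x7F, f
    elif f != 0:                       # negative, nonzero fraction
        e_two, f_two = e - 0x7F, (~f + 1) & 0x7FFFFF
    else:                              # negative, zero fraction
        e_two, f_two = e - 0x80, 0
    return ((e_two & 0xFF) << 24) | (s << 23) | f_two

print(f"{ieee_to_c4x(float_to_ieee_bits(6.0)):08X}")    # 02400000
print(f"{ieee_to_c4x(float_to_ieee_bits(-3.0)):08X}")   # 01C00000
```

The two results match the bit patterns decoded in Examples 5-1 and 5-2.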


The ’C4x assumes that an IEEE number is stored as an integer in memory or 
in a register. When the ’C4x converts an IEEE number, it places the number 
in an extended-precision register by using the exponent and fraction fields of 
the register. The eight LSBs of the extended-precision register are set to zero. 
Any arithmetic operations that are performed on the fraction field of the IEEE 
number should be performed only on the IEEE fraction field. In the case of a 
block memory transfer, a no-penalty data format conversion can be executed 
by using parallel instructions with STF. Example 5-4 illustrates how this can 
be accomplished. 
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Example 5—4. IEEE to ’C4x Conversion Within Block Memory Transfer 


*   TITLE IEEE TO ’C4x CONVERSION WITHIN BLOCK MEMORY TRANSFER
*
*   PROGRAM ASSUMES THAT INPUT FIFO OF COMMUNICATION PORT 0
*   IS FULL OF IEEE FORMAT DATA. EIGHT DATA WORDS ARE
*   TRANSFERRED FROM COMMUNICATION PORT 0 TO INTERNAL RAM
*   BLOCK 0 AND THE DATA FORMAT IS CONVERTED FROM IEEE FORMAT
*   TO ’C4x FLOATING-POINT FORMAT.
*
        LDI     @CP0_IN,AR0     ;Load comm port0 input FIFO address
        LDI     @RAM0,AR1       ;Load internal RAM block 0 address
        FRIEEE  *AR0,R0         ;Convert first data
        RPTS    6
        FRIEEE  *AR0,R0         ;Convert next data
||      STF     R0,*AR1++(1)    ;Store previous data
        STF     R0,*AR1++(1)    ;Store last data



5.4.2 Converting Twos-Complement ’C4x Floating-Point Format to IEEE Format 


This conversion is performed according to the following table: 


Table 5-2. Converting Twos-Complement Floating-Point Format to IEEE Format


        If These Values Are Present            Then These Values Equal
Case    e_two                 s_two  f_two     e_IEEE        s_IEEE  f_IEEE
1       -128                  any    any       00h           0       00 0000h
2       -127                  any    any       00h           0       00 0000h
3       -126 ≤ e_two ≤ 127    0      any       e_two + 7Fh   0       f_two
4       -126 ≤ e_two ≤ 127    1      ≠ 0       e_two + 7Fh   1       ~f_two + 1 †
5       -126 ≤ e_two ≤ 127    1      0         e_two + 80h   1       00 0000h
6       127                   1      0         FFh           1       00 0000h


† ~f_two = ones complement of f_two.


Case 1 maps a twos-complement zero to a positive IEEE zero.


Case 2 maps the twos-complement numbers that are too small to be repre- 
sented as normalized IEEE numbers to a positive IEEE zero. 


Case 3 maps the positive twos-complement numbers that are not covered by 
case 2 into the identically valued IEEE number. 


Case 4 maps the negative twos-complement numbers with a nonzero fraction 
that are not covered in case 2 into the identically valued IEEE number. 


Case 5 maps all the negative twos-complement numbers with a zero fraction,
except for the most negative twos-complement number and those covered by
case 2, into the identically valued IEEE number.


Case 6 maps the most negative twos-complement number to the IEEE nega- 
tive infinity. 
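
The six cases above can be sketched as a small Python model. This is only an illustration, not the TOIEEE hardware: the function names are hypothetical, the fraction is assumed to be the 23-bit single-precision field, and the exponent values for cases 1, 2, and 6 (-128, -127, and 127) are inferred from the case descriptions. Case 6 is tested before case 5 because the case 5 description excludes the most negative number.

```python
import struct

def c4x_to_ieee_bits(e_two: int, s_two: int, f_two: int) -> int:
    """Map 'C4x fields (exponent, sign, 23-bit fraction) to IEEE single bits."""
    if e_two <= -127:                  # cases 1 and 2: zero, or too small
        e, s, f = 0x00, 0, 0
    elif s_two == 0:                   # case 3: positive number
        e, s, f = e_two + 0x7F, 0, f_two
    elif f_two != 0:                   # case 4: negative, nonzero fraction
        e, s, f = e_two + 0x7F, 1, (~f_two + 1) & 0x7FFFFF
    elif e_two == 127:                 # case 6: most negative number
        e, s, f = 0xFF, 1, 0
    else:                              # case 5: negative, zero fraction
        e, s, f = e_two + 0x80, 1, 0
    return (s << 31) | (e << 23) | f

def ieee_bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

For example, the 'C4x fields (e = 0, s = 1, f = 0) represent -2.0, and the model returns the IEEE pattern for -2.0.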


The ’C4x assumes that the twos-complement numbers are in memory or are 
in an extended-precision register in the exponent and fraction field of the regis- 
ter (shown in Figure 5-14 on page 5-13). If the value is in an extended-preci- 
sion register, then only the 24 MSBs of the fraction field are manipulated as 
the fraction field and for detection of the special cases. The result of the con- 
version goes into the 32 MSBs of an extended-precision register. In the case 
of a block memory transfer, a no-penalty data format conversion can be 
executed by using parallel instructions with STF. Example 5-5 illustrates how 
this can be accomplished. 
Example 5-5. ’C4x to IEEE Conversion Within Block Memory Transfer


*
*   TITLE ’C4x TO IEEE CONVERSION WITHIN BLOCK MEMORY TRANSFER
*
*   PROGRAM ASSUMES THAT OUTPUT FIFO OF COMMUNICATION PORT 0
*   IS EMPTY. EIGHT DATA WORDS ARE TRANSFERRED FROM
*   INTERNAL RAM BLOCK 0 TO COMMUNICATION PORT 0 AND THE
*   DATA FORMAT IS CONVERTED FROM ’C4x FLOATING-POINT FORMAT
*   TO IEEE FORMAT.
*
        LDI     @CP0_OUT,AR0    ;Load comm port0 output FIFO address
        LDI     @RAM0,AR1       ;Load internal RAM block 0 address
        TOIEEE  *AR1++(1),R0    ;Convert first data
        RPTS    6
        TOIEEE  *AR1++(1),R0    ;Convert next data
||      STF     R0,*AR0         ;Store previous data
        STF     R0,*AR0         ;Store last data
5.5 Floating-Point Multiplication


A floating-point number a can be written in floating-point format as in the
following formula, where a(man) is the mantissa and a(exp) is the exponent:


a = a(man) x 2^a(exp)


The product of a and b is c, defined as:

c = a x b = a(man) x b(man) x 2^(a(exp) + b(exp))

Thus:

c(man) = a(man) x b(man)

c(exp) = a(exp) + b(exp)


During floating-point multiplication, the source operands are always in the ex- 
tended-precision floating-point format. If the source operands are in short or 
single-precision format, they are converted to extended-precision format. 
These conversions occur automatically in hardware with no overhead. All re- 
sults of floating-point multiplications are returned in the extended-precision 
format. 


A multiplication occurs in a single cycle. 


Figure 5-15 is a flowchart showing the steps involved in a floating-point
multiplication. Each step is labeled with a number in parentheses.


- In step 1, the 32-bit source mantissas, a(man) and b(man), are multiplied,
  producing a 64-bit result, c(man). (Note that input and output data are
  always represented as normalized numbers.)

- In step 2, the exponents, a(exp) and b(exp), are added, yielding c(exp).

- Step 3 checks whether or not c(man) is equal to zero. If c(man) is zero,
  step 7 sets c(exp) to -128, thus yielding the representation for zero.


Steps 4 and 5 normalize the result.


- If a right shift of one is necessary, then in step 8, c(man) is
  right-shifted one bit, and 1 is added to c(exp).

- If a right shift of two is necessary, then in step 9, c(man) is
  right-shifted two bits, and 2 is added to c(exp). Step 6 occurs when the
  result is already normalized.

- In step 10, c(man) is set in the extended-precision floating-point format.


Steps 11 through 16 check for special cases of c(exp).


- In step 14, if c(exp) has overflowed (detected in step 11) in the positive
  direction, then c(exp) is set to the most positive extended-precision format
  value. If c(exp) has overflowed in the negative direction, then c(exp) is
  set to the most negative extended-precision format value.

- If c(exp) has underflowed (detected in step 12), then c is set to zero in
  step 15; i.e., c(man) = 0 and c(exp) = -128.
Figure 5-15. Flowchart for Floating-Point Multiplication

[Flowchart. Inputs: a(man), b(man), a(exp), b(exp). Multiply mantissas:
c(man) = a(man) x b(man). Add exponents: c(exp) = a(exp) + b(exp). Test for
special cases of c(man): (4) right-shift 1 to normalize, c(man) >> 1,
c(exp) = c(exp) + 1; (5) right-shift 2 to normalize, c(man) >> 2,
c(exp) = c(exp) + 2; (6) no shift to normalize. Dispose of extra bits and put
c(man) in extended-precision floating-point format. Test for special cases of
c(exp): (11) c(exp) overflow: if c(man) > 0, set c to most positive value; if
c(man) < 0, set c to most negative value; (12) c(exp) underflow; (13) c(exp)
in range. Set c to final result: c = a x b.]
Example 5-6 through Example 5-9 illustrate how floating-point multiplication
is performed on the ’C4x. For these examples, the implied most significant
nonsign bit is made explicit.


Example 5-6. Floating-Point Multiply (Both Mantissas = -2.0)

Let

a = -2.0 x 2^a(exp) = 10.00000000000000000000000 x 2^a(exp)
b = -2.0 x 2^b(exp) = 10.00000000000000000000000 x 2^b(exp)


where a and b are both represented in binary form according to the normalized
single-precision floating-point format. The product of the two mantissas is
+4.0 (0100.0 in binary), which is not normalized.


To place this number in the proper normalized format, it is necessary to shift
the mantissa two places to the right and add 2 to the exponent. This yields


    10.00000000000000000000000 x 2^a(exp)
  x 10.00000000000000000000000 x 2^b(exp)
  ---------------------------------------
    01.0000000000000000000000000000000000000000000000 x 2^(a(exp) + b(exp) + 2)


In floating-point multiplication, the exponent of the result may overflow when
the exponents are initially added or when the exponent is modified during
normalization.


Example 5-7. Floating-Point Multiply (Both Mantissas = 1.5)


Let

a = 1.5 x 2^a(exp) = 01.10000000000000000000000 x 2^a(exp)
b = 1.5 x 2^b(exp) = 01.10000000000000000000000 x 2^b(exp)


where a and b are both represented in binary form according to the
single-precision floating-point format. Then


    01.10000000000000000000000 x 2^a(exp)
  x 01.10000000000000000000000 x 2^b(exp)
  ---------------------------------------
    0010.0100000000000000000000000000000000000000000000 x 2^(a(exp) + b(exp))


To place this number in the proper normalized format, it is necessary to shift
the mantissa one place to the right and add 1 to the exponent. This yields


    01.10000000000000000000000 x 2^a(exp)
  x 01.10000000000000000000000 x 2^b(exp)
  ---------------------------------------
    01.00100000000000000000000000000000000000000000000 x 2^(a(exp) + b(exp) + 1)
Example 5-8. Floating-Point Multiply (Both Mantissas = 1.0)

Let

a = 1.0 x 2^a(exp) = 01.00000000000000000000000 x 2^a(exp)
b = 1.0 x 2^b(exp) = 01.00000000000000000000000 x 2^b(exp)

where a and b are both represented in binary form according to the
single-precision floating-point format. Then

    01.00000000000000000000000 x 2^a(exp)
  x 01.00000000000000000000000 x 2^b(exp)
  ---------------------------------------
    0001.0000000000000000000000000000000000000000000000 x 2^(a(exp) + b(exp))


This number is in the proper normalized format. Therefore, no shift of the
mantissa or modification of the exponent is necessary.


The previous three examples show cases in which the product of two normal- 
ized numbers can be normalized with a shift of zero, one, or two. The float- 
ing-point format of the ’C4x makes this possible. 
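
The three normalization outcomes can be reproduced with a short Python model. This is a sketch under stated assumptions, not the hardware datapath: mantissas are signed integers scaled by 2**23 (matching the 23 fraction bits printed in the examples, with the implied bit explicit), numbers are hypothetical (mantissa, exponent) pairs, and the function name `fmul` is made up for illustration.

```python
FRAC = 23                      # fraction bits, as printed in the examples
ONE = 1 << (2 * FRAC)          # 1.0 in the scale of the double-width product

def fmul(a, b):
    """Multiply two normalized (mantissa, exponent) pairs.

    Mantissas are signed integers scaled by 2**FRAC, so 1.5 is 3 << 22
    and -2.0 is -(2 << 23). Zero is represented by exponent -128.
    """
    (a_man, a_exp), (b_man, b_exp) = a, b
    if a_exp == -128 or b_exp == -128:      # multiplication by zero
        return (0, -128)
    c_man = a_man * b_man                   # multiply mantissas
    c_exp = a_exp + b_exp                   # add exponents
    for shift in (0, 1, 2):                 # right shift of 0, 1, or 2
        m = c_man >> shift
        if ONE <= m < 2 * ONE or -2 * ONE <= m < -ONE:
            break
    else:
        raise ValueError("operands were not normalized")
    c_exp += shift
    c_man = m >> FRAC                       # dispose of the extra bits
    if c_exp > 127:                         # exponent overflow: saturate
        return ((2 << FRAC) - 1, 127) if c_man > 0 else (-(2 << FRAC), 127)
    if c_exp < -127:                        # exponent underflow: zero
        return (0, -128)
    return (c_man, c_exp)
```

With these conventions, `fmul((-(2 << 23), 0), (-(2 << 23), 0))` needs the two-bit shift of Example 5-6, and `fmul((3 << 22, 0), (3 << 22, 0))` needs the one-bit shift of Example 5-7.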


Example 5-9. Floating-Point Multiply Between Positive and Negative Numbers

Let

a =  1.0 x 2^a(exp) = 01.00000000000000000000000 x 2^a(exp)
b = -2.0 x 2^b(exp) = 10.00000000000000000000000 x 2^b(exp)

Then

    01.00000000000000000000000 x 2^a(exp)
  x 10.00000000000000000000000 x 2^b(exp)


The result is c = -2.0 x 2^(a(exp) + b(exp)).


Floating-Point Multiply by Zero


All multiplications by a floating-point zero yield a result of zero (f=0, s=0, and 
exp = —128). 
5.6 Floating-Point Addition and Subtraction


In floating-point addition and subtraction, two floating-point numbers a and b
can be defined as

a = a(man) x 2^a(exp)

b = b(man) x 2^b(exp)


The sum (or difference) of a and b can be defined as

c = a ± b
  = (a(man) ± (b(man) x 2^-(a(exp) - b(exp)))) x 2^a(exp),   if a(exp) ≥ b(exp)
  = ((a(man) x 2^-(b(exp) - a(exp))) ± b(man)) x 2^b(exp),   if a(exp) < b(exp)


Figure 5-16 is the flowchart for floating-point addition. Because this
flowchart assumes signed data, it is also appropriate for floating-point
subtraction. In this figure, it is assumed that a(exp) ≤ b(exp). Steps are
shown as numbers in parentheses in the figure.


- In step 1, the source exponents are compared, and c(exp) is set equal to
  the larger of the two source exponents.

- In step 2, d is set to the difference of the two exponents.

- In step 3, the mantissa with the smaller exponent, in this case a(man),
  is right-shifted d bits in order to align the mantissas.

- In step 4, after the mantissas have been aligned, they are added.

- Steps 5 through 7 check for special cases of c(man). If c(man) is zero
  (step 5), then c(exp) is set to its most negative value (step 8) to yield
  the correct representation of zero. If c(man) has overflowed (step 6), then
  in step 9, c(man) is right-shifted one bit, and 1 is added to c(exp). In
  step 10, the result is normalized.

- In steps 11 and 12, special cases of c(exp) are tested. If c(exp) has
  overflowed, then c is set to the most positive extended-precision value if
  it is positive; if it is negative, it is set to the most negative
  extended-precision value.
Figure 5-16. Flowchart for Floating-Point Addition

[Flowchart. Inputs: a(man), b(man), a(exp), b(exp). (1) Compare exponents: if
a(exp) <= b(exp), c(exp) = b(exp); else c(exp) = a(exp) (the figure assumes
for simplicity that a(exp) <= b(exp)). (2) Subtract exponents:
d = b(exp) - a(exp). (3) Align mantissas: a(man) = a(man) >> d, discarding
LSBs to keep a(man) in extended-precision floating-point format. (4) Add
mantissas: c(man) = a(man) + b(man). Test for special cases of c(man): zero,
then (8) c(exp) = -128; (6) overflow of c(man), then c(man) = c(man) >> 1 and
c(exp) = c(exp) + 1, discarding LSBs to keep c(man) in extended-precision
floating-point format; (7) k = number of leading nonsignificant sign bits.
Test for special cases of c(exp): (11) c(exp) overflow: if c(man) > 0, set c
to most positive value; if c(man) < 0, set c to most negative value; (12)
c(exp) underflow: set c to zero, c(exp) = -128, c(man) = 0; (13) c(exp) in
range. Set c to final result: c = a + b.]


The following examples describe the floating-point addition and subtraction 
operations. It is assumed that the data is in the extended-precision 
floating-point format. 
Example 5-10. Floating-Point Addition

Let

a = 1.5 = 01.1000000000000000000000000000000 x 2^0
b = 0.5 = 01.0000000000000000000000000000000 x 2^-1


It is necessary to shift b to the right by one so that a and b have the same
exponent. This yields


b = 0.5 = 00.1000000000000000000000000000000 x 2^0


Then


    01.10000000000000000000000000000000 x 2^0
  + 00.10000000000000000000000000000000 x 2^0
  -------------------------------------------
   010.00000000000000000000000000000000 x 2^0


As in the case of multiplication, it is necessary to shift the binary point
one place to the left and to add 1 to the exponent. This yields


    01.10000000000000000000000000000000 x 2^0
  + 00.10000000000000000000000000000000 x 2^0
  -------------------------------------------
    01.00000000000000000000000000000000 x 2^1


Example 5-11. Floating-Point Subtraction

Let

a = 01.0000000000000000000000000000001 x 2^0
b = 01.0000000000000000000000000000000 x 2^0


The operation to be performed is a - b. The mantissas are already aligned
because the two numbers have the same exponent. The result is a large
cancellation of the upper bits, as shown below.


    01.0000000000000000000000000000001 x 2^0
  - 01.0000000000000000000000000000000 x 2^0
  ------------------------------------------
    00.0000000000000000000000000000001 x 2^0


The result must be normalized. In this case, a left shift of 31 is required.
The exponent of the result is modified accordingly. The result is


    01.0000000000000000000000000000001 x 2^0
  - 01.0000000000000000000000000000000 x 2^0
  ------------------------------------------
    01.0000000000000000000000000000000 x 2^-31


Example 5-12. Floating-Point Addition With a 32-Bit Shift


This example illustrates a situation in which a full 32-bit shift is necessary
to normalize the result. Let


a = 01.1111111111111111111111111111111 x 2^127
b = 10.0000000000000000000000000000000 x 2^127

The operation to be performed is a + b.


    01.1111111111111111111111111111111 x 2^127
  + 10.0000000000000000000000000000000 x 2^127
  --------------------------------------------
    11.1111111111111111111111111111111 x 2^127


Normalizing the result requires a left shift of 32 and a subtraction of 32
from the exponent. The result is


    01.1111111111111111111111111111111 x 2^127
  + 10.0000000000000000000000000000000 x 2^127
  --------------------------------------------
    10.0000000000000000000000000000000 x 2^95


Example 5-13. Floating-Point Addition/Subtraction and Zero


When floating-point addition and subtraction are performed with a
floating-point 0, the following identities are satisfied:


a ± 0 = a    (assuming that a ≠ 0)
0 ± 0 = 0
0 - a = -a   (assuming that a ≠ 0)
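
The align, add, and normalize sequence used in Examples 5-10 through 5-13 can be sketched in Python. This is an illustrative model only: mantissas are assumed to be signed integers scaled by 2**31 with the implied bit explicit, numbers are hypothetical (mantissa, exponent) pairs with exponent -128 standing for zero, and the names `normalize`, `fadd`, and `fsub` are invented for this sketch.

```python
FRAC = 31
ONE = 1 << FRAC                # 1.0 in mantissa units

def normalize(man, exp):
    """Bring man into [1, 2) or [-2, -1) and adjust exp to compensate."""
    if man == 0:
        return (0, -128)                       # representation of zero
    while man >= 2 * ONE or man < -2 * ONE:    # mantissa overflow
        man >>= 1
        exp += 1
    while -ONE <= man < ONE:                   # leading nonsignificant sign bits
        man <<= 1
        exp -= 1
    if exp > 127:                              # exponent overflow: saturate
        return (2 * ONE - 1, 127) if man > 0 else (-2 * ONE, 127)
    if exp < -127:                             # exponent underflow: zero
        return (0, -128)
    return (man, exp)

def fadd(a, b):
    """Add two (mantissa, exponent) pairs."""
    (a_man, a_exp), (b_man, b_exp) = a, b
    if a_exp > b_exp:                          # let a hold the smaller exponent
        a_man, a_exp, b_man, b_exp = b_man, b_exp, a_man, a_exp
    d = b_exp - a_exp                          # difference of the exponents
    return normalize((a_man >> d) + b_man, b_exp)

def fsub(a, b):
    """Subtract b from a by adding the negated mantissa of b."""
    return fadd(a, (-b[0], b[1]))
```

Calling `fsub((ONE + 1, 0), (ONE, 0))` reproduces the 31-bit left shift of Example 5-11, and `fadd((2 * ONE - 1, 127), (-2 * ONE, 127))` reproduces the 32-bit shift of Example 5-12.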
5.7 Normalization (NORM Instruction)


The NORM instruction normalizes an extended-precision floating-point num- 
ber that is assumed to be unnormalized. Since the number is assumed to be 
unnormalized, no implied most significant nonsign bit is assumed. The NORM 
instruction executes three steps: 


1) Locates the most significant nonsign bit of the floating-point number 
2) Left shifts to normalize the number 
3) Adjusts the exponent 


Given the extended-precision floating-point value to be normalized, the nor- 
malization is performed as shown in Figure 5-17. 


Figure 5-17. Flowchart for NORM Instruction Operation

[Flowchart. Test for special cases of a(man). If a(man) = 0: c(exp) = -128.
(2) Leading nonsignificant sign bits: sign-extend a(man) 1 bit; k = number of
leading nonsignificant sign bits; c(man) = a(man) << k; c(exp) = a(exp) - k;
remove most significant nonsign bit. Test for special cases of c(exp): (6)
c(exp) underflow: c(exp) = -128, no change to c(man); (7) c(exp) in range.
Result: c = norm(a).]
Example 5-14. NORM Instruction

Assume that an extended-precision register contains the value:

man = 00000000000000000001000000000001, exp = 0


When the normalization is performed on a number assumed to be unnormalized,
the binary point is assumed to be:


man = 0.0000000000000000001000000000001, exp = 0


This number is then sign extended one bit so that the mantissa contains 33
bits:


man = 00.0000000000000000001000000000001, exp = 0


Here is the intermediate result after the most significant nonsign bit is
located and the shift is performed:


man = 01.0000000000010000000000000000000, exp = -19

The final 32-bit value output after removing the redundant bit is:

man = 00000000000010000000000000000000, exp = -19


The NORM instruction is useful for counting the number of leading zeros or 
leading ones in a 32-bit field. If the exponent is initially zero, the absolute value 
of the final value of the exponent is the number of leading ones or zeros. This 
instruction is also useful for manipulating unnormalized floating-point num- 
bers. 
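
The leading-bit count described above can be checked with a small Python model of the three NORM steps. This is a sketch, not a cycle-accurate description: the function name `norm` is hypothetical, the mantissa is passed as a raw 32-bit pattern, and the returned tuple layout (mantissa pattern, exponent, shift count) is chosen only for illustration.

```python
ONE = 1 << 31                  # weight of the most significant nonsign bit

def norm(man32: int, exp: int):
    """Normalize a 32-bit mantissa pattern; return (man32_out, exp_out, k)."""
    # Interpret the pattern as a signed twos-complement value.
    m = man32 - (1 << 32) if man32 & 0x80000000 else man32
    if m == 0:
        return (0, -128, 0)                    # zero: exponent forced to -128
    k = 0
    while -ONE <= (m << k) < ONE:              # locate most significant nonsign bit
        k += 1
    shifted = m << k                           # left shift to normalize
    sign = 1 if m < 0 else 0
    out = (sign << 31) | (shifted & 0x7FFFFFFF)  # drop the redundant bit
    new_exp = exp - k                          # adjust the exponent
    if new_exp < -127:                         # exponent underflow
        new_exp = -128
    return (out, new_exp, k)
```

For the value in Example 5-14, `norm(0x00001001, 0)` gives an exponent of -19, which matches the 19 leading zeros of the input.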




5.8 Rounding (RND Instruction) 


The RND instruction rounds a number from the extended-precision floating-point
format to the single-precision floating-point format in a single cycle.
Rounding (rnd) is similar to floating-point addition. Given the number a to be
rounded, the following operation is performed first:


c = a(man) x 2^a(exp) + (1 x 2^(a(exp) - 24))


Next, a conversion from extended-precision floating-point to single-precision
floating-point format is performed. Given the extended-precision floating-point
value, rounding is performed as shown in Figure 5-18.


Note:

RND src, dst, where (src) = 0, does not set the zero condition flag (bit 2 in
the status register). Instead, it sets the underflow condition flag (bit 4 in
the status register). When required, check for the underflow condition instead
of the zero condition.
Figure 5-18. Flowchart for Floating-Point Rounding by the RND Instruction

[Flowchart. Input a and the constant 1 x 2^(a(exp) - 24). Add a(man) and 1/2
an LSB: c(man) = a(man) + 2^-24. Test for special cases of c(man): c(man) = 0;
overflow of c(man), in which case c(man) is shifted one bit and
c(exp) = a(exp) + 1; no special case. Test for special cases of c(exp): c(exp)
overflow: if c(man) > 0, set c to most positive single-precision value; if
c(man) < 0, set c to most negative single-precision value; c(exp) in range.
Set 8 LSBs of c(man) to zero. Result: c = rnd(a).]
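
The rounding operation can be approximated in Python as follows. This is a rough sketch under several assumptions: the mantissa is a signed integer scaled by 2**31 with the implied bit explicit, the function name `rnd` and the (mantissa, exponent) tuple are hypothetical, status flags are not modeled, and the renormalization of a negative result that rounds up to exactly -1.0 is this sketch's own choice rather than something stated in the text.

```python
FRAC = 31
ONE = 1 << FRAC                # 1.0 in mantissa units

def rnd(man: int, exp: int):
    """Round an extended-precision value to single precision."""
    if man == 0:
        return (0, -128)                       # zero stays zero
    c = man + (1 << (FRAC - 24))               # add 2**-24, half a single LSB
    if c >= 2 * ONE:                           # mantissa overflow
        c >>= 1
        exp += 1
    elif -ONE <= c < 0:                        # assumption: renormalize -1.0
        c <<= 1
        exp -= 1
    if exp > 127:                              # exponent overflow: saturate
        return (2 * ONE - 256, 127)
    c &= ~0xFF                                 # set the 8 LSBs to zero
    return (c, exp)
```

A value exactly halfway between two single-precision mantissas, such as `ONE + 0x80`, rounds up to `ONE + 0x100`, while `ONE + 0x7F` rounds down to `ONE`.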
5.9 Floating-Point-to-Integer Conversion (FIX Instruction)


Using the FIX instruction, you can convert an extended-precision
floating-point number to a single-precision integer in a single cycle. The
floating-point-to-integer conversion of the value x is referred to here as
fix(x). The conversion does not overflow if a, the number to be converted, is
in the range


-2^31 ≤ a ≤ 2^31 - 1

First, you must be certain that:

a(exp) ≤ 30


If these bounds are not met, an overflow occurs. If an overflow occurs in the
positive direction, the output is the most positive integer. If an overflow
occurs in the negative direction, the output is the most negative integer. If
a(exp) is within the valid range, then a(man), with implied bit included, is
sign-extended and right-shifted (rs) by the amount:


rs = 31 - a(exp)

This right shift (rs) shifts out those bits corresponding to the fractional
part of the mantissa. For example:

If 0 ≤ x < 1, then fix(x) = 0.

If -1 ≤ x < 0, then fix(x) = -1.


The flowchart for the floating-point-to-integer conversion is shown in 
Figure 5-19. 
Figure 5-19. Flowchart for Floating-Point-to-Integer Conversion by FIX Instruction

[Flowchart. Input a. Test for special cases of a(exp). If a(exp) > 30
(overflow): if a(man) > 0, c = most positive integer; if a(man) < 0, c = most
negative integer. If a(exp) is in range: rs = 31 - a(exp) and
c = a(man) >> rs. Set c to final result: c = fix(a).]
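
The conversion rule can be sketched in Python. The model below is an illustration only: it assumes the mantissa is a signed integer scaled by 2**31 with the implied bit explicit, and the name `fix` and the constants `INT_MAX` and `INT_MIN` are chosen here for convenience.

```python
ONE = 1 << 31                  # 1.0 in mantissa units
INT_MAX = 2**31 - 1
INT_MIN = -2**31

def fix(man: int, exp: int) -> int:
    """Convert a (mantissa, exponent) pair to a 32-bit integer."""
    if man == 0:
        return 0                               # floating-point zero
    if exp > 30:                               # overflow: saturate
        return INT_MAX if man > 0 else INT_MIN
    return man >> (31 - exp)                   # shift out the fractional bits
```

Because the arithmetic right shift discards bits toward minus infinity, `fix(ONE, -1)` (0.5) returns 0 and `fix(-2 * ONE, -2)` (-0.5) returns -1, in line with the two rules quoted above.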
5.10 Integer-to-Floating-Point Conversion (FLOAT Instruction)


Integer-to-floating-point conversion performed by the FLOAT instruction al- 
lows a single-precision integer to be converted to an extended-precision float- 
ing-point number in a single cycle. The flowchart for this conversion is shown 
in Figure 5-20. 


Figure 5-20. Flowchart for Integer-to-Floating-Point Conversion by FLOAT Instruction

[Flowchart. Input a. Test for special cases of c(man). If c(man) = 0:
c(exp) = -128. Otherwise (leading nonsignificant sign bits): k = number of
leading nonsignificant sign bits; c(man) = c(man) << k; c(exp) = 30 - k;
remove most significant nonsign bit. Set c to final result: c = float(a).]
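
A Python sketch of the integer-to-floating-point conversion follows. It is a model under assumptions, not the instruction's implementation: the result is returned as a hypothetical (mantissa, exponent) pair with the mantissa scaled by 2**31 and the implied bit kept explicit, and the name `float_from_int` is invented here.

```python
def float_from_int(a: int):
    """Convert a 32-bit signed integer to a (mantissa, exponent) pair."""
    if not -2**31 <= a <= 2**31 - 1:
        raise ValueError("a must fit in a 32-bit signed integer")
    if a == 0:
        return (0, -128)                       # representation of zero
    # Count the bits that carry magnitude information below the sign bit.
    bits = a.bit_length() if a > 0 else (~a).bit_length()
    k = 31 - bits                              # leading nonsignificant sign bits
    exp = 30 - k
    man = a << (31 - exp)                      # implied bit explicit, scale 2**31
    return (man, exp)
```

For instance, `float_from_int(5)` yields mantissa 1.25 (as `5 << 29`) with exponent 2, and `float_from_int(-1)` yields mantissa -2.0 with exponent -1.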
5.11 Reciprocal (RCPF Instruction)


The RCPF instruction generates a satisfactory estimate of the reciprocal of a
floating-point number in a single cycle. The estimate has the correct exponent,
and the mantissa is accurate to the eighth binary position (mantissa error is
thus < 2^-8), giving a 16-bit representation of the result (8-bit exponent plus
8-bit mantissa). Also, this estimate can be used as a seed for an algorithm to
compute the reciprocal to even greater accuracy. (The Newton-Raphson
algorithm, described in this section, is one such case.)


Figure 5-21 depicts the algorithm used by the RCPF instruction. In the
algorithm:


- The input is assumed to be v = vman x 2^vexp.

- The output is assumed to be x = xman x 2^xexp.

- vexp is negated.

- If vexp = -128, the result is saturated to the most positive number, and the
  overflow flag is set. The N condition flag is set to the same sign as vsign.


Figure 5-21. RCPF Instruction Algorithm

[Block diagram. vexp is negated to form xexp; if vexp = -128, the overflow
flag is set and the result saturates to the most positive number. vsign and
vfrac(22-15) address a 512 x 8 look-up table, whose output is xfrac(22-15).
xman is formed with xfrac(14-0) = 0 and xsign = vsign.]


The look-up table is read by forming a nine-bit address consisting of vsign and
bits 22-15 of vfrac. The eight-bit output of the look-up table forms bits 22-15
of xfrac. Bits 14-0 of xfrac are cleared to zero. xsign is set to vsign.


The look-up table values are generated from simulation results. 


5.11.1 Reciprocal Algorithm 


The RCPF instruction provides an estimate of the reciprocal of a number. The
estimate has the correct exponent and a mantissa accurate to the eighth binary
place (i.e., the error of the mantissa is < 2^-8). The Newton-Raphson algorithm
(shown below) can be used to further extend the mantissa’s precision:


x[n+1] = x[n](2 - v x[n])


where v = the number whose reciprocal is to be found. 


x[0], the seed for the algorithm, is given by RCPF. For each iteration of the algo- 
rithm, the number of accurate bits in the mantissa doubles. Using RCPF, you 
can start with an estimate accurate to eight bits. With one iteration, accuracy 
is 16 bits in the mantissa, and with a second iteration, accuracy is 32 bits. 


The ’C4x program to implement this algorithm is shown in Example 5-15. 
Each step of the algorithm is labeled along with the corresponding accuracy 
achieved at the end of the step. The algorithm takes only seven machine 
cycles. 


Example 5-15. Newton-Raphson Algorithm for Computing the Reciprocal


        RCPF    R0,R1           ; R0 = v, R1 = x[0]

        MPYF    R1,R0,R2        ; R2 = v * x[0]
        SUBRF   2.0,R2          ; R2 = 2.0 - v * x[0]
        MPYF    R2,R1           ; end of first iteration (16-bit accuracy)

        MPYF    R1,R0,R2        ; R2 = v * x[1]
        SUBRF   2.0,R2          ; R2 = 2.0 - v * x[1]
        MPYF    R2,R1           ; end of second iteration (32-bit accuracy)

                                ; R1 = 1/v
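
The doubling of accuracy per iteration can be observed with a short Python experiment. This is only a numerical illustration in host double precision: `seed_8bit` is a hypothetical stand-in that truncates the true reciprocal to about eight mantissa bits, and it does not reproduce the RCPF look-up table.

```python
import math

def seed_8bit(value: float) -> float:
    """Stand-in seed: keep the exponent, truncate the mantissa to 8 bits."""
    m, e = math.frexp(value)                   # value = m * 2**e, 0.5 <= |m| < 1
    return math.ldexp(math.floor(m * 512) / 512, e)

def reciprocal(v: float, iterations: int = 2) -> float:
    """Refine an 8-bit seed of 1/v with x[n+1] = x[n] * (2 - v * x[n])."""
    x = seed_8bit(1.0 / v)
    for _ in range(iterations):
        x = x * (2.0 - v * x)
    return x
```

With this iteration the relative error is squared at every step, so `abs(1 - 3.0 * reciprocal(3.0, n))` falls below 2**-8, 2**-16, and 2**-32 for n = 0, 1, and 2.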
5.12 Reciprocal Square Root (RSQRF Instruction)

In many applications, normalization of data values is necessary. Often, the
normalizing factor is the square root of another quantity. For example, when
one vector is given, you can find the unit vector in the same direction by
dividing the original vector by its own length. This involves division by a
square root. The RSQRF instruction provides a simple way to directly determine
this quantity instead of going through a two-step approach of finding the
square root and then finding the reciprocal of the square root.


Given the result of this algorithm, the square root is found by a simple
multiplication:


sqrt(v) = v x[n]


where x[n] is the estimate of 1/sqrt(v) as determined by the Newton-Raphson
algorithm or some other algorithm.


The RSQRF instruction generates an estimated reciprocal of the square root
of a floating-point number in a single cycle. It parallels some of the
operational characteristics of the RCPF instruction in these ways:


- RSQRF generates an estimate (in this case, the reciprocal of the square
  root of a floating-point number).

- The mantissa is accurate to the eighth binary place (mantissa error is
  < 2^-8).

- Often, this is a satisfactory estimate of the reciprocal of a number’s
  square root; in other cases, it may be used as a seed for an algorithm that
  computes the reciprocal square root to an even greater accuracy.


Figure 5-22 depicts the RSQRF algorithm. In the algorithm:

- The input is assumed to be v = vman x 2^vexp.

- The output is assumed to be x = xman x 2^xexp.

- vexp + 1 is negated and shifted right one bit with sign extension.

- If vexp = -128, the result is saturated to the most positive number, and the
  overflow flag is set.


Figure 5-22. RSQRF Instruction Algorithm

[Block diagram. -(vexp + 1) is shifted right one bit and sign extended to
form xexp; if vexp = -128, the overflow flag is set and the result saturates
to the most positive number. vexp(0) and vfrac(22-15) address a 512 x 8
look-up table, whose output is xfrac(22-15). xman is formed with
xfrac(14-0) = 0 and xsign = 0.]


The look-up table is read by forming a nine-bit address consisting of the least
significant bit of vexp and bits 22-15 of vfrac. The eight-bit output of the
look-up table forms bits 22-15 of xfrac. Bits 14-0 of xfrac are cleared to
zero. xsign is set to 0. There is no provision for negative values of v.


The look-up table values are generated from simulation results. 


Given the result of this algorithm, division is performed by a simple
multiplication:

y/v = y x[n]


In the equation, x[n] is the estimate of 1/v as determined by the
Newton-Raphson algorithm or another algorithm.


Newton-Raphson Algorithm 


The RSQRF instruction provides an estimate of the reciprocal of the square
root of a number. The estimate has the correct exponent and a mantissa accurate
to the eighth binary place (i.e., the error of the mantissa is < 2^-8). The
Newton-Raphson algorithm (shown below) can be used to further extend the
mantissa’s precision:


x[n+1] = x[n](1.5 - (v/2) x[n] x[n])


where v = the number whose reciprocal square root is to be found.

The seed for the algorithm, x[0], is given by RSQRF. For each iteration of the
algorithm, the number of accurate bits in the mantissa doubles. Using RSQRF,
you can start with an estimate accurate to eight bits. With one iteration,
accuracy is 16 bits in the mantissa, and with a second iteration, accuracy is
32 bits.


The ’C4x program to implement this algorithm is shown in Example 5-16. 
Each step of the algorithm is labeled, and the corresponding accuracy 
achieved is noted at the end of the step. The algorithm takes only ten machine 
cycles (compared to 30 cycles on the ’C3x without a look-up table). 


Example 5-16. Newton-Raphson Algorithm for Computing the Reciprocal Square Root


        RSQRF   R0,R1           ; R0 = v, R1 = x[0]
        MPYF    0.5,R0          ; R0 = v/2

        MPYF    R1,R1,R2
        MPYF    R0,R2
        SUBRF   1.5,R2
        MPYF    R2,R1           ; end of first iteration (16-bit accuracy)

        MPYF    R1,R1,R2
        MPYF    R0,R2
        SUBRF   1.5,R2
        MPYF    R2,R1           ; end of second iteration (32-bit accuracy)

                                ; R1 = 1/(v**0.5)
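
The same recurrence can be tried numerically in Python. This is a sketch in host double precision, not a model of RSQRF: `seed_8bit` is a hypothetical stand-in that truncates the true value to about eight mantissa bits, and the error bounds quoted below are specific to this sketch (each step roughly squares the relative error and multiplies it by 1.5).

```python
import math

def seed_8bit(value: float) -> float:
    """Stand-in seed: keep the exponent, truncate the mantissa to 8 bits."""
    m, e = math.frexp(value)                   # value = m * 2**e, 0.5 <= |m| < 1
    return math.ldexp(math.floor(m * 512) / 512, e)

def rsqrt(v: float, iterations: int = 2) -> float:
    """Refine a seed of 1/sqrt(v) with x[n+1] = x[n] * (1.5 - (v/2) x[n]^2)."""
    x = seed_8bit(1.0 / math.sqrt(v))
    half_v = 0.5 * v                           # computed once, like MPYF 0.5,R0
    for _ in range(iterations):
        x = x * (1.5 - half_v * x * x)
    return x
```

The square root itself is then recovered with one more multiplication, `v * rsqrt(v)`, as described at the start of this section.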
Chapter 6


Addressing Modes 


The ’C4x supports five types of addressing to access data from memory, regis- 
ters, and the instruction word. This chapter details the operation, encoding, 
and implementation of the addressing modes. 


6.1 Addressing Types 




You can access data from memory, registers, and the instruction word by using 
five types of addressing: 


- Register addressing
- Direct addressing
- Indirect addressing
- Immediate addressing
- PC-relative addressing


Not all addressing types are appropriate for all instructions. Addressing types 
are classified into four groups, depending upon the encoding method used: 


- General addressing modes (G)
- Three-operand addressing modes (T)
- Parallel addressing modes (P)
- Conditional-branch addressing modes (B)


For use in filters and FFTs, there are two specialized modes:


- Circular addressing
- Bit-reversed addressing




6.2 Register Addressing 


In register addressing, a CPU register contains the operand, as shown in this 
example: 


ABSF R1 ; R1 = |R1| 


The machine address for the CPU registers, the assembler syntax (register 
name), and the assigned function for those registers are listed in Table 6-1. 


Table 6-1. CPU Register/Assembler Syntax and Function


(a) CPU Primary Registers


Register   Machine   Assigned
Name       Address   Function

R0         00h       Extended-precision register 0
R1         01h       Extended-precision register 1
R2         02h       Extended-precision register 2
R3         03h       Extended-precision register 3
R4         04h       Extended-precision register 4
R5         05h       Extended-precision register 5
R6         06h       Extended-precision register 6
R7         07h       Extended-precision register 7
R8         1Ch       Extended-precision register 8
R9         1Dh       Extended-precision register 9
R10        1Eh       Extended-precision register 10
R11        1Fh       Extended-precision register 11
AR0        08h       Auxiliary register 0
AR1        09h       Auxiliary register 1
AR2        0Ah       Auxiliary register 2
AR3        0Bh       Auxiliary register 3
AR4        0Ch       Auxiliary register 4
AR5        0Dh       Auxiliary register 5
AR6        0Eh       Auxiliary register 6
AR7        0Fh       Auxiliary register 7
DP         10h       Data-page pointer
IR0        11h       Index register 0
IR1        12h       Index register 1
BK         13h       Block-size register
SP         14h       Active stack pointer
ST         15h       Status register
DIE        16h       DMA coprocessor interrupt enable
IIE        17h       Internal interrupt enable register
IIF        18h       IIOF pins and interrupt flag register
RS         19h       Repeat start address register
RE         1Ah       Repeat end address register
RC         1Bh       Repeat counter register


(b) CPU Expansion Registers


Register   Machine   Assigned
Name       Address   Function

IVTP       00h       Interrupt-vector table pointer
TVTP       01h       Trap-vector table pointer


6.3 Direct Addressing 


In direct addressing, the data address is formed by the concatenation of the 
16 least significant bits of the data page pointer (DP) with the 16 least signifi- 
cant bits of the instruction word (expr). The use of 16 bits for the DP results in 
65536 pages (64K words per page), allowing you to access a large address 
space without changing the value of the DP. The syntax and operation for di- 
rect addressing are listed below. 


Syntax: @expr 
Operation: address = DP concatenated with expr 


Figure 6—1 shows the formation of the data address. Example 6—1 gives an 
instruction example with data before and after instruction execution. 


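Figure 6-1. Direct Addressing

[Diagram: the 16 least significant bits of the data page pointer (DP) form
bits 31-16 of the operand address, and the 16 least significant bits of the
instruction word (expr) form bits 15-0.]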


Example 6-1. Direct Addressing


ADDI @0BCDEh,R7


Before Instruction:                    After Instruction:
DP = 108Ah                             DP = 108Ah
R7 = 11h                               R7 = 1234 5689h
Data at 108A BCDEh = 1234 5678h        Data at 108A BCDEh = 1234 5678h
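
The address formation in Example 6-1 can be checked with a few lines of Python. The helper name `direct_address` is hypothetical and the function models only the concatenation described above, not instruction decoding.

```python
def direct_address(dp: int, expr: int) -> int:
    """Concatenate the 16 LSBs of DP with the 16 LSBs of the instruction word."""
    return ((dp & 0xFFFF) << 16) | (expr & 0xFFFF)

# Values from Example 6-1: DP = 108Ah and expr = 0BCDEh.
address = direct_address(0x108A, 0xBCDE)
memory = {address: 0x12345678}                 # data word at the operand address
r7_after = memory[address] + 0x11              # ADDI adds the operand to R7
```

Here `address` is 108A BCDEh and `r7_after` is 1234 5689h, the values shown in the example.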


6.4 Indirect Addressing 


Indirect addressing specifies the address of an operand in memory through 
the contents of an auxiliary register, optional displacements, and index regis- 
ters. The auxiliary register arithmetic units (ARAUs) perform this unsigned 
arithmetic. (All 32 bits of the auxiliary and index registers are used in indirect 
addressing.) 


The flexibility of indirect addressing is possible because the ARAUs on the
’C4x modify auxiliary registers in parallel with operations within the main
CPU. Indirect addressing is specified by a five-bit field in the instruction
word, referred to as the mod field (shown on the left side of Table 6-2 as
well as in the examples that follow). A displacement is either an explicit
unsigned 5-bit or 8-bit integer contained in the instruction word or an
implicit displacement of one. Two index registers, IR0 and IR1, can also be
used in indirect addressing, enabling the use of 32-bit indirect displacements
(IR0 and IR1 are treated as signed integers). In some cases, an addressing
scheme using circular or bit-reversed addressing is optional. Generating
addresses for circular addressing is discussed in Section 6.8, and for
bit-reversed addressing in Section 6.9.


Table 6-2 lists the various kinds of indirect addressing, along with the value
of the modification (mod) field, assembler syntax, operation, and function for
each. Figure 6-2 shows the format of the indirect addressing operand in the
instruction encoding. The disp field does not exist for some instructions.


Figure 6-2. Indirect Addressing Operand Encoding


MSB                                      LSB
  mod          ARn          disp
  (5 bits)     (3 bits)     (0, 5, or 8 bits)


Note:


The auxiliary register (ARn) to be used is encoded in the instruction word
according to its binary representation, n (for example, AR3 is encoded as 11b),
not its register machine address (as shown in Table 6-1).


Table 6-2. Indirect Addressing


(a) Indirect Addressing With Displacement

Mod Field  Syntax          Operation                Description
00000      *+ARn(disp)     addr = ARn + disp        With predisplacement add
00001      *-ARn(disp)     addr = ARn - disp        With predisplacement subtract
00010      *++ARn(disp)    addr = ARn + disp        With predisplacement add and modify
                           ARn = ARn + disp
00011      *--ARn(disp)    addr = ARn - disp        With predisplacement subtract and modify
                           ARn = ARn - disp
00100      *ARn++(disp)    addr = ARn               With postdisplacement add and modify
                           ARn = ARn + disp
00101      *ARn--(disp)    addr = ARn               With postdisplacement subtract and modify
                           ARn = ARn - disp
00110      *ARn++(disp)%   addr = ARn               With postdisplacement add and circular modify
                           ARn = circ(ARn + disp)
00111      *ARn--(disp)%   addr = ARn               With postdisplacement subtract and circular modify
                           ARn = circ(ARn - disp)


(b) Indirect Addressing With Index Register IR0

Mod Field  Syntax          Operation                Description
01000      *+ARn(IR0)      addr = ARn + IR0         With preindex (IR0) add
01001      *-ARn(IR0)      addr = ARn - IR0         With preindex (IR0) subtract
01010      *++ARn(IR0)     addr = ARn + IR0         With preindex (IR0) add and modify
                           ARn = ARn + IR0
01011      *--ARn(IR0)     addr = ARn - IR0         With preindex (IR0) subtract and modify
                           ARn = ARn - IR0
01100      *ARn++(IR0)     addr = ARn               With postindex (IR0) add and modify
                           ARn = ARn + IR0
01101      *ARn--(IR0)     addr = ARn               With postindex (IR0) subtract and modify
                           ARn = ARn - IR0
01110      *ARn++(IR0)%    addr = ARn               With postindex (IR0) add and circular modify
                           ARn = circ(ARn + IR0)
01111      *ARn--(IR0)%    addr = ARn               With postindex (IR0) subtract and circular modify
                           ARn = circ(ARn - IR0)


(c) Indirect Addressing With Index Register IR1

Mod Field  Syntax          Operation                Description
10000      *+ARn(IR1)      addr = ARn + IR1         With preindex (IR1) add
10001      *-ARn(IR1)      addr = ARn - IR1         With preindex (IR1) subtract
10010      *++ARn(IR1)     addr = ARn + IR1         With preindex (IR1) add and modify
                           ARn = ARn + IR1
10011      *--ARn(IR1)     addr = ARn - IR1         With preindex (IR1) subtract and modify
                           ARn = ARn - IR1
10100      *ARn++(IR1)     addr = ARn               With postindex (IR1) add and modify
                           ARn = ARn + IR1
10101      *ARn--(IR1)     addr = ARn               With postindex (IR1) subtract and modify
                           ARn = ARn - IR1
10110      *ARn++(IR1)%    addr = ARn               With postindex (IR1) add and circular modify
                           ARn = circ(ARn + IR1)
10111      *ARn--(IR1)%    addr = ARn               With postindex (IR1) subtract and circular modify
                           ARn = circ(ARn - IR1)


(d) Indirect Addressing (Special Cases)

Mod Field  Syntax          Operation                Description
11000      *ARn            addr = ARn               Indirect
11001      *ARn++(IR0)B    addr = ARn               With postindex (IR0) add and bit-reversed modify
                           ARn = B(ARn + IR0)


LEGEND:

addr    memory address
ARn     auxiliary register AR0-AR7
IRn     index register IR0 or IR1
disp    displacement
++      add and modify
--      subtract and modify
circ()  address in circular addressing
%       where circular addressing is performed
B       where bit-reversed addressing is performed
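

The regular structure of Table 6-2 (the two upper mod bits select disp, IR0, or
IR1; the three lower bits select the pre/post and add/subtract behaviour) can be
captured in a small model. The Python sketch below is illustrative only: the
function name indirect_access is an assumption made here, and the circular and
bit-reversed modes are deliberately left out because they need the BK register
and reverse-carry logic described in Sections 6.8 and 6.9.

```python
MASK32 = 0xFFFFFFFF


def indirect_access(mod, arn, disp=1, ir0=0, ir1=0):
    """Return (operand address, new ARn) for one indirect access.

    Models mod fields 00000-00101, 01000-01101, 10000-10101 and 11000.
    """
    if mod == 0b11000:                       # *ARn: plain indirect
        return arn & MASK32, arn & MASK32
    group, op = mod >> 3, mod & 0b111
    if group > 2 or op > 5:
        raise NotImplementedError("circular/bit-reversed modes not modelled")
    step = (disp, ir0, ir1)[group]           # which quantity is applied
    if op % 2:                               # odd low bits mean subtract
        step = -step
    modified = (arn + step) & MASK32
    if op <= 1:                              # pre-add/subtract, ARn unchanged
        return modified, arn & MASK32
    if op <= 3:                              # pre-add/subtract and modify
        return modified, modified
    return arn & MASK32, modified            # post-add/subtract and modify


print(indirect_access(0b00010, 0x100, disp=4))   # *++AR(4)
print(indirect_access(0b00100, 0x100, disp=4))   # *AR++(4)
```

The default disp=1 mirrors the implied displacement of one that the assembler
uses when no displacement is written.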


Example 6-2 through Example 6-19 show the operation for each type of indirect
addressing.


Example 6-2. Auxiliary Register Indirect 


An auxiliary register (ARn) contains the address of the operand to be fetched. 


Operation:          operand address = ARn
Assembler Syntax:   *ARn
Modification Field: 11000


Example 6-3. Indirect With Predisplacement Add


The address of the operand to be fetched is the sum of an auxiliary register
(ARn) and the displacement (disp). The displacement is either a 5-bit or 8-bit
unsigned integer contained in the instruction word or an implied value of 1.


Operation:          operand address = ARn + disp
Assembler Syntax:   *+ARn(disp)
Modification Field: 00000


(Figure: the 8-bit or 5-bit unsigned integer displacement, with its remaining
24 or 27 upper bits zero filled, is added to the 32-bit ARn to form the operand
address.)


Example 6-4. Indirect With Predisplacement Subtract 


The address of the operand to be fetched is the contents of an auxiliary register 
(ARn) minus the displacement (disp). The displacement is either an 8-bit un- 
signed integer contained in the instruction word or an implied value of 1. 


Operation:          operand address = ARn - disp
Assembler Syntax:   *-ARn(disp)
Modification Field: 00001


Example 6-5. Indirect With Predisplacement Add and Modify 


The address of the operand to be fetched is the sum of an auxiliary register 
(ARn) and the displacement (disp). The displacement is either an 8-bit un- 
signed integer contained in the instruction word or an implied value of 1. After 
the data is fetched, the auxiliary register is updated with the generated ad- 
dress. 


Operation:          operand address = ARn + disp
                    ARn = ARn + disp
Assembler Syntax:   *++ARn(disp)
Modification Field: 00010


Example 6-6. Indirect With Predisplacement Subtract and Modify 


The address of the operand to be fetched is the contents of an auxiliary register
(ARn) minus the displacement (disp). The displacement is either an 8-bit
unsigned integer contained in the instruction word or an implied value of 1.
After the data is fetched, the auxiliary register is updated with the generated
address.


Operation:          operand address = ARn - disp
                    ARn = ARn - disp
Assembler Syntax:   *--ARn(disp)
Modification Field: 00011


Example 6-7. Indirect With Postdisplacement Add and Modify 


The address of the operand to be fetched is the contents of an auxiliary register 
(ARn). After the operand is fetched, the displacement (disp) is added to the 
auxiliary register. The displacement is either an 8-bit unsigned integer con- 
tained in the instruction word or an implied value of 1. 


Operation:          operand address = ARn
                    ARn = ARn + disp
Assembler Syntax:   *ARn++(disp)
Modification Field: 00100


Example 6-8. Indirect With Postdisplacement Subtract and Modify


The address of the operand to be fetched is the contents of an auxiliary register 
(ARn). After the operand is fetched, the displacement (disp) is subtracted from 
the auxiliary register. The displacement is either an 8-bit unsigned integer con- 
tained in the instruction word or an implied value of 1. 


Operation:          operand address = ARn
                    ARn = ARn - disp
Assembler Syntax:   *ARn--(disp)
Modification Field: 00101


Example 6-9. Indirect With Postdisplacement Add and Circular Modify 


The address of the operand to be fetched is the contents of an auxiliary register 
(ARn). After the operand is fetched, the displacement (disp) is added to the 
contents of the auxiliary register through circular addressing. This result is 
used to update the auxiliary register. The displacement is either an 8-bit un- 
signed integer contained in the instruction word or an implied value of 1. 


Operation:          operand address = ARn
                    ARn = circ(ARn + disp)
Assembler Syntax:   *ARn++(disp)%
Modification Field: 00110


Example 6-10. Indirect With Postdisplacement Subtract and Circular Modify 


The address of the operand to be fetched is the contents of an auxiliary register 
(ARn). After the operand is fetched, the displacement (disp) is subtracted from 
the contents of the auxiliary register through circular addressing. This result 
is used to update the auxiliary register. The displacement is either an 8-bit un- 
signed integer contained in the instruction word or an implied value of 1. 


Operation:          operand address = ARn
                    ARn = circ(ARn - disp)
Assembler Syntax:   *ARn--(disp)%
Modification Field: 00111


Example 6-11. Indirect With Preindex Add


The address of the operand to be fetched is the sum of an auxiliary register
(ARn) and an index register (IR0 or IR1).


Operation:          operand address = ARn + IRm
Assembler Syntax:   *+ARn(IRm)
Modification Field: 01000 if m = 0
                    10000 if m = 1


Example 6-12. Indirect With Preindex Subtract 


The address of the operand to be fetched is the difference between an auxiliary
register (ARn) and an index register (IR0 or IR1).


Operation:          operand address = ARn - IRm
Assembler Syntax:   *-ARn(IRm)
Modification Field: 01001 if m = 0
                    10001 if m = 1


Example 6-13. Indirect With Preindex Add and Modify 


The address of the operand to be fetched is the sum of an auxiliary register
(ARn) and an index register (IR0 or IR1). After the data is fetched, the auxiliary
register is updated with the generated address.


Operation:          operand address = ARn + IRm
                    ARn = ARn + IRm
Assembler Syntax:   *++ARn(IRm)
Modification Field: 01010 if m = 0
                    10010 if m = 1


Example 6-14. Indirect With Preindex Subtract and Modify


The address of the operand to be fetched is the difference between an auxiliary
register (ARn) and an index register (IR0 or IR1). The resulting address becomes
the new contents of the auxiliary register.


Operation:          operand address = ARn - IRm
                    ARn = ARn - IRm
Assembler Syntax:   *--ARn(IRm)
Modification Field: 01011 if m = 0
                    10011 if m = 1


Example 6-15. Indirect With Postindex Add and Modify


The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, an index register (IR0 or IR1) is added to
the auxiliary register.


Operation:          operand address = ARn
                    ARn = ARn + IRm
Assembler Syntax:   *ARn++(IRm)
Modification Field: 01100 if m = 0
                    10100 if m = 1


Example 6-16. Indirect With Postindex Subtract and Modify


The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the index register (IR0 or IR1) is
subtracted from the auxiliary register.


Operation:          operand address = ARn
                    ARn = ARn - IRm
Assembler Syntax:   *ARn--(IRm)
Modification Field: 01101 if m = 0
                    10101 if m = 1


Example 6-17. Indirect With Postindex Add and Circular Modify 


The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the index register (IR0 or IR1) is added
to the auxiliary register. This value is evaluated through circular addressing
and replaces the contents of the auxiliary register.


Operation:          operand address = ARn
                    ARn = circ(ARn + IRm)
Assembler Syntax:   *ARn++(IRm)%
Modification Field: 01110 if m = 0
                    10110 if m = 1


Example 6-18. Indirect With Postindex Subtract and Circular Modify


The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the index register (IR0 or IR1) is
subtracted from the auxiliary register. The result is evaluated through circular
addressing and replaces the contents of the auxiliary register.


Operation:          operand address = ARn
                    ARn = circ(ARn - IRm)
Assembler Syntax:   *ARn--(IRm)%
Modification Field: 01111 if m = 0
                    10111 if m = 1


Example 6-19. Indirect With Postindex Add and Bit-Reversed Modify 


The address of the operand to be fetched is the contents of an auxiliary register
(ARn). After the operand is fetched, the index register (IR0) is added to the
auxiliary register. This addition is performed with a reverse-carry propagation
and can be used to yield a bit-reversed (B) address. This value replaces the
contents of the auxiliary register.


Operation:          operand address = ARn
                    ARn = B(ARn + IR0)
Assembler Syntax:   *ARn++(IR0)B
Modification Field: 11001
6.5 Immediate Addressing


In immediate addressing, the operand is an 8- or 16-bit immediate value
contained in the 8 or 16 least significant bits of the instruction word (expr).
Depending on the data types assumed for the instruction, the immediate operand
may be a twos-complement integer, an unsigned integer, a signed integer, or a
floating-point number. The syntax for this mode is as follows:


Syntax: expr


Example 6-20 gives an instruction example with data from before and after
the instruction is executed. Notice that AND and AND3 produce different
results.


Example 6-20. Immediate Addressing


Instruction           Before                    After
SUBI  1,R0            R0 = 0h                   R0 = 00 FFFF FFFFh
LDI   0FFFFh,R0       R0 = 0h                   R0 = 00 FFFF FFFFh
LDF   5.0,R0          R0 = 0h                   R0 = 02 2000 0000h
OR    0FFFFh,R0       R0 = 0h                   R0 = 00 0000 FFFFh
AND3  80h,R0,R0       R0 = 00 FFFF FFFFh        R0 = 00 FFFF FF80h
AND   80h,R0          R0 = 00 FFFF FFFFh        R0 = 00 0000 0080h
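

The different AND and AND3 results in Example 6-20 are consistent with the
immediate being extended in two different ways: as an 8-bit signed value for
AND3 and as a 16-bit unsigned value for AND. The Python sketch below reproduces
the integer rows of the example under that reading; the helper names are
assumptions made here, and only the 32-bit integer part of R0 is modelled.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` of value as twos-complement, widen to 32 bits."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK32


def zero_extend(value, bits):
    """Interpret the low `bits` of value as unsigned."""
    return value & ((1 << bits) - 1)


r0 = 0xFFFFFFFF
and3_result = r0 & sign_extend(0x80, 8)     # AND3 80h,R0,R0
and_result = r0 & zero_extend(0x80, 16)     # AND  80h,R0
ldi_result = sign_extend(0xFFFF, 16)        # LDI  0FFFFh,R0
or_result = 0 | zero_extend(0xFFFF, 16)     # OR   0FFFFh,R0

for name, value in [("AND3", and3_result), ("AND", and_result),
                    ("LDI", ldi_result), ("OR", or_result)]:
    print(f"{name:5s} {value:08X}h")
```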




6.6 PC-Relative Addressing 


PC-relative addressing is used for branching. It adds the contents of the 16 or
24 least significant bits of the instruction word to the PC register. The
assembler takes the src (a label or address) specified by the user and generates
a displacement. If the branch is a standard branch, this displacement is equal
to [label - (instruction address + 1)]. If the branch is a delayed branch, this
displacement is equal to [label - (instruction address + 3)].


The displacement is stored as a 16-bit or 24-bit signed integer in the least sig- 
nificant bits of the instruction word. The displacement is added to the PC during 
the pipeline decode phase. Notice that because the PC is incremented by one 
in the fetch phase, the displacement is added to this incremented PC value. 


Syntax: expr (label or address) 


Example 6-21 gives an instruction example with before- and after-instruction 
data. 


Example 6-21. PC-Relative Addressing 


        BU      NEWPC       ; address of BU instruction = 1,
        ...                 ; NEWPC label = 5, displacement = 3
NEWPC   ...                 ; displacement = 5 - (1 + 1)


Before Instruction          After Instruction
(Decode Phase):             (Execution Phase):
PC = 2h                     PC = 5h
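

The displacement rule for standard and delayed branches can be written out
directly. The Python sketch below is a model of the arithmetic described in this
section, not of the assembler itself; the function names are assumptions made
here.

```python
def branch_displacement(label, instr_addr, delayed=False):
    """label - (instruction address + 1), or + 3 for a delayed branch."""
    return label - (instr_addr + (3 if delayed else 1))


def encode_field(disp, bits):
    """Store a signed displacement in a 16- or 24-bit instruction field."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= disp <= hi:
        raise ValueError(f"displacement {disp} does not fit in {bits} bits")
    return disp & ((1 << bits) - 1)


def decode_field(field, bits):
    """Recover the signed displacement from the instruction field."""
    field &= (1 << bits) - 1
    return field - (1 << bits) if field & (1 << (bits - 1)) else field


def branch_target(instr_addr, field, bits, delayed=False):
    """Address reached when the branch at instr_addr is taken."""
    return instr_addr + (3 if delayed else 1) + decode_field(field, bits)


# Example 6-21: BU at address 1, NEWPC at address 5
disp = branch_displacement(5, 1)
print(disp, branch_target(1, encode_field(disp, 16), 16))
```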


The 24-bit addressing mode is used to encode the program control instructions
(e.g., BR, BRD, CALL, RPTB, RPTBD, LAJ). Depending on the instruction, the new
PC value is derived by adding a 24-bit signed value in the instruction word to
the present PC value. Bit 24 determines the type of branch (D = 0 for a standard
branch or D = 1 for a delayed branch). Some of these instructions are encoded in
Figure 6-3.
Figure 6-3. Encoding for 24-Bit PC-Relative Addressing Mode


(a) BR, BRD: unconditional branches (delayed and not delayed)

    Bits     31-25       24      23-0
    Field    0110000     D       src


(b) CALL: unconditional subroutine call

    Bits     31-24       23-0
    Field    01100010    src


(c) RPTB, RPTBD: repeat block (not delayed and delayed)

    Bits     31-25       24      23-0
    Field    0111100     D       src


(d) LAJ: link and jump (return address in extended-precision register R11)

    Bits     31-24       23-0
    Field    01100011    src
6.7 Encoding of Addressing Modes


The five addressing types form four groups of addressing modes:

- General addressing modes (G) (subsection 6.7.1)

- Three-operand addressing modes (T) (subsection 6.7.2)

- Parallel addressing modes (P) (subsection 6.7.3)

- Conditional-branch addressing modes (B) (subsection 6.7.4)


6.7.1 General Addressing Modes 


Instructions that use the general addressing modes are general-purpose in- 
structions, such as ADDI, MPYF, and LSH. Such instructions usually have the 
following syntax: 


dst operation src → dst


In the syntax, the destination operand is signified by dst and the source
operand by src; operation defines the operation to be performed, and the general
addressing modes specify how the src operand is addressed. Bits 31-29 are zero,
indicating general addressing mode instructions. Bits 22 and 21 specify the
general addressing mode (G) field, which defines how bits 15 through 0 are to be
interpreted for addressing the src operand.


Options for bits 22 and 21 (G field) are as follows:


G       Mode
00      register (all CPU registers unless specified otherwise)
01      direct
10      indirect
11      immediate


If the src and dst fields contain register specifications, the value in these
fields contains the CPU register addresses as defined by Table 6-1. For the
general addressing modes, the following values of ARn are valid for indirect
addressing:


ARn, 0 ≤ n ≤ 7


Figure 6-4 shows the encoding for the general addressing modes. The notation
modn indicates the modification field that goes with the ARn field. Refer
to Table 6-2 for further information.
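

As a rough illustration of how the G field steers the interpretation of bits
15-0, the Python sketch below splits a 32-bit instruction word using only the
bit positions stated in this section and the field widths of Figure 6-2 (mod 5
bits, ARn 3 bits, disp 8 bits). The function name and the returned dictionary
layout are assumptions made here; the opcode and dst fields are not decoded.

```python
G_MODES = {0b00: "register", 0b01: "direct", 0b10: "indirect", 0b11: "immediate"}


def decode_general_source(word):
    """Classify the src operand of a general-addressing-mode instruction."""
    if (word >> 29) & 0b111:
        raise ValueError("bits 31-29 must be zero for general addressing modes")
    mode = G_MODES[(word >> 21) & 0b11]      # G field, bits 22-21
    src = word & 0xFFFF                      # bits 15-0
    if mode == "indirect":
        return {
            "mode": mode,
            "mod": src >> 11,                # modn, upper 5 bits of src
            "arn": (src >> 8) & 0b111,       # auxiliary register number n
            "disp": src & 0xFF,              # 8-bit displacement
        }
    return {"mode": mode, "src": src}


# Hand-built words: a direct operand @0BCDEh and an indirect *AR3++(1)
print(decode_general_source((0b01 << 21) | 0xBCDE))
print(decode_general_source((0b10 << 21) | (0b00100 << 11) | (3 << 8) | 1))
```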
Figure 6-4. Encoding for General Addressing Modes


(Figure: instruction-word layout with bits 31-29 = 000, the operation code, the
G field in bits 22-21, the destination field, and the source operand in bits
15-0, shown for the register, direct, indirect (modn, ARn, disp), and immediate
modes.)


6.7.2 Three-Operand Addressing Modes 


The 19 three-operand instructions on the ’C4x use the eight addressing modes 
listed in Table 6-3: 


Table 6-3. Three-Operand Instruction Addressing Modes


Type 1†

T    src1 addressing modes                    src2 addressing modes                    dst‡
00   Register mode (any CPU register)         Register mode (any CPU register)         Rx
01   Indirect mode (disp = 0, 1, IR0, IR1)    Register mode (any CPU register)         Rx
10   Register mode (any CPU register)         Indirect mode (disp = 0, 1, IR0, IR1)    Rx
11   Indirect mode (disp = 0, 1, IR0, IR1)    Indirect mode (disp = 0, 1, IR0, IR1)    Rx


Type 2†

T    src1 addressing modes                    src2 addressing modes                    dst‡
00   Register mode (any CPU register)         8-bit signed immediate                   Rx
01   Register mode (any CPU register)         Indirect mode *+ARn(5-bit unsigned       Rx
                                              displacement)
10   Indirect mode *+ARn(5-bit unsigned       8-bit signed immediate                   Rx
     displacement)
11   Indirect mode *+ARn1(5-bit unsigned      Indirect mode *+ARn2(5-bit unsigned      Rx
     displacement)                            displacement)

† The ’C4x recognizes either type 1 or type 2 modes; the ’C3x recognizes only type 1.
‡ Rx = any register in the CPU (primary) register file for the respective processor.
The object values differ for three-operand instructions, depending on the
assembler used:


- The ’C3x assembler recognizes only type 1 modes and sets bits 31-28 to
  0010b.


- The ’C4x assembler recognizes both types and sets bits 31-28 to 0010b
  for type 1 and to 0011b for type 2.


The three-operand instructions MPYSHI3 and MPYUHI3 are unique to the
’C4x.


All instructions except four can use all of the type 2 address modes shown in 
Table 6-3. The exceptions, which can use only the second and fourth address 
modes in type 2, are the floating-point instructions ADDF3, CMPF3, MPYF3, 
and SUBF3. 


The remaining 15 three-operand instructions are ADDC3, ADDI3, AND3, 
ANDN3, ASH3, CMPI3, LSH3, MPYI3, MPYSHI3, MPYUHI3, OR3, SUBB3, 
SUBI3, TSTB3, and XOR3. 


Note:


The suffix 3 can be omitted from a three-operand instruction mnemonic.


Bits 22 and 21 specify the three-operand addressing mode (T) field, which
defines how to interpret bits 15-0 for addressing the src operands. Bits 15-8
define the src1 address, and bits 7-0 define the src2 address.


Figure 6-5 and Figure 6-6 show the encoding for ’C4x three-operand addressing
(the ’C3x recognizes only the format in Figure 6-5). The notation modm or modn
indicates the modification field that goes with the ARm or ARn (auxiliary
register) field, respectively. Refer to Table 6-2 for further information.


The 8-bit signed immediate value supports left shifts, right shifts, and memory 
increment and decrement operations. The immediate value is not available for 
floating-point operations. 


These instructions greatly help reduce code size, both assembled and com- 
piled. They also improve performance notably in DSP and other computation- 
ally intensive applications and general-purpose code. 
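Figure 6-5. Encoding for Type 1 Three-Operand Addressing Modes (’C3x and ’C4x)


(Figure: instruction-word layout showing the T field, the dst field, and the
src1 and src2 fields, each holding either a register or a modification field
paired with an auxiliary register (modm/ARm or modn/ARn).)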


6.7.3 Parallel Addressing Modes 


Instructions that use parallel addressing, indicated by || (two vertical bars),
allow for the greatest amount of parallelism possible. The destination operands
are indicated as d1 and d2, signifying dst1 and dst2, respectively (see Figure
6-7). The source operands, signified by src1 and src2, use the
extended-precision registers. The parallel operation to be performed is called
operation.


Figure 6-7. Encoding for Parallel Multiply With ADD/SUB


Bits    31-30  29-26      25-24  23  22  21-19  18-16  15-11  10-8  7-3   2-0
Field   10     operation  P      d1  d2  src1   src2   modn   ARn   modm  ARm


The parallel addressing mode (P) field specifies how to use the operands, i.e.,
whether they are source or destination. The specific relationship between the
P field and the operands is detailed in the description of the individual parallel
instructions (see Chapter 14 for more information). However, the operands are
always encoded in the same way. Bits 31 and 30 are set to the value of 10,
indicating parallel addressing mode instructions. Bits 25 and 24 specify the
parallel addressing mode (P) field, which defines how bits 21-0 are to be
interpreted for addressing the src operands. Bits 21-19 define the src1 address,
bits 18-16 define the src2 address, bits 15-8 the src3 address, and bits 7-0
the src4 address. The notations modn and modm indicate the modification
field that goes with the ARn or ARm (auxiliary register) field, respectively. The
parallel addressing operands are listed below.


src1 = Rn   (0 ≤ n ≤ 7 for extended-precision registers R0-R7)
src2 = Rn   (0 ≤ n ≤ 7 for extended-precision registers R0-R7)
d1          If 0, dst1 is R0. If 1, dst1 is R1.
d2          If 0, dst2 is R2. If 1, dst2 is R3.
P           0 ≤ P ≤ 3
src3        indirect (disp = 0, 1, IR0, IR1)
src4        indirect (disp = 0, 1, IR0, IR1)


Note:


Only registers R0-R7 are used in parallel instructions. R8-R11 are not used
in parallel instructions.


As in the three-operand addressing mode, indirect addressing in the parallel
addressing mode allows for displacements of 0 or 1 and the use of the index
registers (IR0 and IR1). The displacement of 1 is implied and is not explicitly
coded in the instruction word.


In the encoding shown for this mode in Figure 6-7, if the src3 and src4 fields
use the same auxiliary register, both addresses are correctly generated, but
only the value created by the src3 field is saved in the specified auxiliary
register. The assembler issues a warning if you specify the same auxiliary
register for src3 and src4.


6.7.4 Conditional-Branch Addressing Modes 


Instructions using the conditional-branch addressing modes (Bcond, BcondD,
CALLcond, DBcond, and DBcondD) can perform a variety of conditional
operations. Bits 31-27 are set to the value of 01101, indicating
conditional-branch addressing mode instructions. Bit 26 is set to 0 or 1; 0
selects DBcond, and 1 selects Bcond. Bit 25 determines the conditional-branch
addressing mode (B). If B = 0, register addressing is used; if B = 1,
PC-relative addressing is used. Bit 21 sets the type of branch: D = 0 for a
standard branch, and D = 1 for a delayed branch. The condition field (cond)
specifies the condition checked to determine what action to take, for example,
whether or not to branch (see Table 14-8 on page 14-14 for a list of condition
codes). Figure 6-8 shows the encoding for conditional-branch addressing.
Figure 6-8. Encoding for Conditional-Branch Addressing Modes


(Figure: instruction-word layouts for DBcond(D), Bcond(D), and CALLcond, with
field boundaries at bits 31-26, 25, 24-22, 21, 20-16, and 15-0. The Bcond(D)
layout places B in bit 25, zeros in bits 24-22, D in bit 21, cond in bits 20-16,
and either a source register in bits 4-0 (bits 15-5 zero) or a PC-relative
immediate in bits 15-0.)


6.8 Circular Addressing 


Many DSP algorithms require a circular buffer in memory. In convolution and 
correlation, a circular buffer acts as a sliding window that contains the most 
recent data to be processed. As new data is brought in, the new data over- 
writes the oldest data. The key to using a circular buffer is the implementation 
of a circular addressing mode. This section describes the circular addressing 
mode of the ’C4x. 


The block-size register (BK) specifies the size of the circular buffer. If the most
significant bit equal to 1 in the BK register is labeled bit N, with N ≤ 15, the
address immediately following the bottom of the circular buffer can be found
by concatenating bits 31 through N+1 of a user-selected register (ARn) with
bits N through 0 of the BK register. The address of the top of the buffer is
referred to as the effective base (EB) and is formed from bits 31 through N+1
of ARn, with bits N through 0 of EB set to zero.


Figure 6-9 illustrates the relationships among the block-size register (BK), the 
auxiliary registers (ARn), the bottom of the circular buffer, the top of the circular 
buffer, and the index into the circular buffer. 


A circular buffer of size R must start on a K-bit boundary (that is, the K LSBs
of the starting address of the circular buffer must be zeros), where K is an
integer such that 2^K > R. Since the value R must be loaded into the BK register,
K = N+1. For example, a 31-word circular buffer must start at an address
whose five LSBs are 0 (that is, xxx...x00000), and the value 31 must be loaded
into the BK register.
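

The alignment rule and the two buffer addresses described above can be computed
as follows. This Python sketch is illustrative only; the function names are
assumptions made here, and K is taken as the smallest integer with 2^K > R.

```python
MASK32 = 0xFFFFFFFF


def alignment_bits(size):
    """Smallest K such that 2**K > size (equals N + 1 for BK = size)."""
    if size <= 0:
        raise ValueError("buffer size must be positive")
    return size.bit_length()


def is_valid_buffer_start(address, size):
    """True if the K LSBs of the starting address are all zero."""
    return address & ((1 << alignment_bits(size)) - 1) == 0


def buffer_bounds(arn, bk):
    """Return (EB, address immediately following the bottom of the buffer)."""
    low_mask = (1 << alignment_bits(bk)) - 1
    eb = arn & ~low_mask & MASK32        # bits 31..N+1 of ARn, low bits zero
    return eb, eb | bk                   # high bits of ARn joined with BK


print(alignment_bits(31), is_valid_buffer_start(0x10000020, 31))
print([hex(a) for a in buffer_bounds(0x10000025, 31)])
```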


Note:


If the BK register has a value of 0, circular addressing is not performed. The
effect will be the generation of a conventional linear address.
Figure 6-9. Register Relationships in Circular Addressing


(Figure: the first 1 in BK is at location N. The high-order bits H of ARn pass
unchanged into the new ARn and, with zeros below them, form EB. The low-order
bits L of ARn and BK feed the circular addressing algorithm logic, which
produces the new low-order bits L'.)


LEGEND:
ARn = auxiliary register n       L   = low-order bits
BK  = block-size register        L'  = new low-order bits
EB  = effective base             LSB = least significant bit
H   = high-order bits            N   = location of the MSB equal to 1 in the BK register
In circular addressing, index refers to the N LSBs of the auxiliary register
selected, and step is the quantity being added to or subtracted from the
auxiliary register. When you use circular addressing, follow two basic rules:


- The step used must be less than or equal to the block size and is treated
  as an unsigned integer.


- The first time the circular queue is addressed, the auxiliary register must
  be pointing to an element in the circular queue.


The algorithm for circular addressing is as follows:


If 0 ≤ index + step < BK:
    index = index + step.
Else if index + step ≥ BK:
    index = index + step - BK.
Else if index + step < 0:
    index = index + step + BK.
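

The three-branch algorithm above translates almost line for line into code. The
Python sketch below is a behavioural model only; the function name is an
assumption made here, a negative step stands for a subtraction of an unsigned
step, and the index is taken as the bits of ARn up to and including the most
significant set bit of BK.

```python
MASK32 = 0xFFFFFFFF


def circular_modify(arn, step, bk):
    """Return the new ARn after adding a signed step with circular wrap."""
    if bk == 0:                          # BK = 0: ordinary linear address
        return (arn + step) & MASK32
    low_mask = (1 << bk.bit_length()) - 1
    base = arn & ~low_mask & MASK32      # high-order bits stay unchanged
    index = (arn & low_mask) + step
    if index >= bk:
        index -= bk
    elif index < 0:
        index += bk
    return base | index


# Block size 6, pointer starting at 0, steps +5, +2, -3, +6, -1
ar0, trace = 0, [0]
for step in (+5, +2, -3, +6, -1):
    ar0 = circular_modify(ar0, step, 6)
    trace.append(ar0)
print(trace)
```

Because only the low-order bits are rewritten, the same function works for a
buffer placed at any suitably aligned base address.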


Figure 6-10 shows how the circular buffer is implemented. It illustrates the 
relationship of the generated quantities and the elements in the circular buffer. 


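Figure 6-10. Circular Buffer Implementation


(Figure: address and data columns. The effective base (EB), formed from bits 31
through N+1 of ARn with zeros below, addresses element 0 at the top of the
circular buffer. The auxiliary register (ARn) addresses the element selected by
the N LSBs of ARn. The high-order bits of ARn joined with the LSBs of BK address
the location after the last element (last element + 1).)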
Figure 6-11 gives an example of the operation of circular addressing. Assuming
that all registers are four bits, let BK = 0110b (block size of 6) and
AR0 = 0000b (at least the 3 LSBs of AR0 should be 0). This example shows
a sequence of modifications and the resulting value of AR0. It also shows how
the pointer steps through the circular queue with a variety of step sizes (both
incrementally and decrementally).


Figure 6-11. Circular Addressing Example


*AR0++(5)%      ; AR0 = 0 (0th value)
*AR0++(2)%      ; AR0 = 5 (1st value)
*AR0--(3)%      ; AR0 = 1 (2nd value)
*AR0++(6)%      ; AR0 = 4 (3rd value)
*AR0--%         ; AR0 = 4 (4th value)
*AR0            ; AR0 = 3 (5th value)


Circular addressing is especially useful for the implementation of FIR filters.
Figure 6-12 shows one possible data structure for FIR filters. Note that the
initial value of AR0 points to h(N-1), and the initial value of AR1 points to
x(0). Circular addressing is used in the ’C4x code for the FIR filter shown in
Example 6-22.


Figure 6-12. Data Structure for FIR Filters


(Figure: two arrays, the impulse response addressed by AR0 and the input
samples addressed by AR1.)


Example 6-22. FIR Filter Code Using Circular Addressing


* Initialization
*
        LDI     N,BK                ; Load block size.
        LDI     H,AR0               ; Load pointer to impulse response.
        LDI     X,AR1               ; Load pointer to bottom of input
                                    ; sample buffer.
*
TOP     LDF     IN,R3               ; Read input sample.
        STF     R3,*AR1++%          ; Store with other samples,
                                    ; and point to top of buffer.
        LDF     0,R0                ; Initialize R0.
        LDF     0,R2                ; Initialize R2.
*
* Filter
*
        RPTS    N-1                 ; Repeat next instruction.
        MPYF3   *AR0++%,*AR1++%,R0
||      ADDF3   R0,R2,R2            ; Multiply and accumulate.
        ADDF    R0,R2               ; Last product accumulated.
*
        STF     R2,Y                ; Save result.
        B       TOP                 ; Repeat.
6.9 Bit-Reversed Addressing


The ’C4x can implement fast Fourier transforms (FFT) with bit-reversed ad- 
dressing. If the data to be transformed is in the correct order, the final result 
of the FFT is in bit-reversed order. To recover the frequency-domain data in 
the correct order, certain memory locations must be swapped. The bit-rev- 
ersed addressing mode makes swapping unnecessary. The next time data 
must be accessed, it is accessed in a bit-reversed manner rather than sequen- 
tially. In the ’C4x, this bit-reversed addressing can be implemented with both 
the CPU and DMA. 


For correct CPU (or DMA) bit-reverse operation, the base address of
bit-reversed addressing must be located on a boundary of the size of the FFT
table. The CPU bit-reverse operation can be illustrated by assuming an FFT
table of size N = 2^n. When real and imaginary data are stored in separate
arrays, the n LSBs of the base address must be zero, and IR0 must be equal to
2^(n-1) (half of the FFT size). When real and imaginary data are stored in
consecutive memory locations (Re-Im-Re-Im), the n + 1 LSBs of the base address
must be zero, and IR0 must be equal to 2^n (FFT size).


For CPU bit-reversing, one auxiliary register (AR2 in this case) points to the
physical location of a data value. When you add IR0 to this auxiliary register
by using bit-reversed addressing, addresses are generated in a bit-reversed
fashion (reverse carry propagation). The largest index for bit-reversed
addressing is 0008 0000h; this index is treated as an unsigned integer.


To illustrate bit-reversed addressing, assume 8-bit auxiliary registers. Let
AR2 contain the value 0110 0000b (96 decimal). This is the base address of the
data in memory. Let IR0 contain the value 0000 1000b (8 decimal). Example 6-23
shows a sequence of modifications of AR2 and the resulting values of AR2.


Example 6-23. Bit-Reversed Addressing Example

*AR2++(IR0)B  ; AR2 = 0110 0000 (0th value)
*AR2++(IR0)B  ; AR2 = 0110 1000 (1st value)
*AR2++(IR0)B  ; AR2 = 0110 0100 (2nd value)
*AR2++(IR0)B  ; AR2 = 0110 1100 (3rd value)
*AR2++(IR0)B  ; AR2 = 0110 0010 (4th value)
*AR2++(IR0)B  ; AR2 = 0110 1010 (5th value)
*AR2++(IR0)B  ; AR2 = 0110 0110 (6th value)
*AR2          ; AR2 = 0110 1110 (7th value)




Table 6—4 shows the relationship of the index steps and the four LSBs of AR2. 
You can find the four LSBs by reversing the bit pattern of the steps. 


Table 6-4. Index Steps and Bit-Reversed Addressing

Step   Bit Pattern   Bit-Reversed Pattern   Bit-Reversed Step
 0     0000          0000                    0
 1     0001          1000                    8
 2     0010          0100                    4
 3     0011          1100                   12
 4     0100          0010                    2
 5     0101          1010                   10
 6     0110          0110                    6
 7     0111          1110                   14
 8     1000          0001                    1
 9     1001          1001                    9
10     1010          0101                    5
11     1011          1101                   13
12     1100          0011                    3
13     1101          1011                   11
14     1110          0111                    7
15     1111          1111                   15
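Reverse carry propagation can be modeled in software by reversing the bits of
both operands, adding normally, and reversing the result. The Python sketch
below is a model under the example's own assumption of 8-bit auxiliary
registers, not ’C4x code. The function names are hypothetical.

```python
def reverse_bits(value, width):
    """Mirror the low `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_reversed_add(ar, ir0, width=8):
    """Add ir0 to ar with carries propagating from MSB toward LSB."""
    mask = (1 << width) - 1
    total = (reverse_bits(ar, width) + reverse_bits(ir0, width)) & mask
    return reverse_bits(total, width)


def bit_reversed_sequence(base, ir0, count, width=8):
    """List the addresses produced by `count` uses of *AR2++(IR0)B."""
    sequence, ar = [], base
    for _ in range(count):
        sequence.append(ar)
        ar = bit_reversed_add(ar, ir0, width)
    return sequence


addresses = bit_reversed_sequence(0b01100000, 0b00001000, 16)
low_nibbles = [a & 0xF for a in addresses]
```

The first eight entries of addresses match Example 6-23, and low_nibbles
reproduces the bit-reversed step column of Table 6-4.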


Note:

Bit-reverse operation of the DMA coprocessor is covered in Chapter 11 of
this user’s guide and in the TMS320C4x General-Purpose Applications
User’s Guide.
Chapter 7


Program Flow Control 


The ’C4x provides a complete set of constructs that allow software and hard- 
ware control of the program flow. Software control includes repeats, branches, 
calls, traps, and returns. Hardware control includes interrupts. You can select 
the constructs best suited for your particular application. 
7.1 Repeat Mode


The repeat mode of the ’C4x can implement zero-overhead looping. For many 
algorithms, most execution time is spent in an inner kernel of code. Using the 
repeat modes allows these time-critical sections of code to be executed in the 
shortest possible time. 


The ’C4x provides three instructions to support zero-overhead looping: RPTB
(repeat a block of code), RPTBD (repeat a block of code delayed), and RPTS
(repeat a single instruction):

- RPTB and RPTBD cause a block of code to be repeated a specified number
  of times.

- RPTS causes a single instruction to be repeated a number of times and
  reduces bus traffic by fetching the instruction only once.


RPTB and RPTS are four-cycle instructions; these four cycles of overhead are 
incurred only on the first pass through the loop. All subsequent passes through 
the loop are accomplished with zero cycles of loop overhead. RPTBD is a one- 
cycle instruction. 


Three registers (RS, RE, and RC) control the updating of the program counter 
when it is updated in a repeat mode, as described in Table 7—1 below. 


Table 7-1. Repeat-Mode Registers 


Register Function 


RS Repeat start address register. Holds the address of the first 
instruction of the code block to be repeated. 


RE Repeat end address register. Holds the address of the last instruc- 
tion of the code block to be repeated. RE should be greater than 
or equal to RS (see subsection 7.1.2). 


RC Repeat-count register. Contains one less than the number of 
times remaining for the code block to be repeated. 


Correct operation of the repeat modes requires that all of the above registers 
and status register fields be initialized correctly. RPTB, RPTBD, and RPTS 
perform this initialization in slightly different ways (see subsection 7.1.3 and 
subsection 7.1.4 for more information). 




7.1.1 Control Bits


Two bits are important to the operation of RPTB, RPTBD, and RPTS:

- The RM (repeat-mode flag) bit in the status register specifies whether or
  not the processor fetches instructions during the repeat mode.

  - If RM = 0, fetches are not made in repeat mode.
  - If RM = 1, fetches are made in repeat mode.

- The S bit is internal to the processor and cannot be programmed, but this
  bit is necessary to fully describe the operation of RPTB, RPTBD, and
  RPTS.

  - If RM = 1 and S = 0, RPTB or RPTBD is executing. Program fetches
    occur from memory.

  - If RM = 1 and S = 1, RPTS is executing. After the first fetch (from
    memory), program fetches occur from the instruction register (IR).


7.1.2 Repeat-Mode Operation 


Information in the repeat-mode registers and associated control bits is used 
to control the modification of the PC when instruction fetches are being made 
in repeat mode. The repeat modes compare the contents of the RE register 
(repeat end address register) with the program counter (PC) after the execu- 
tion of each instruction. If they match and the repeat counter is nonnegative, 
the repeat counter is decremented, the PC is loaded with the repeat start ad- 
dress, and processing continues. The fetches and appropriate status bits are 
modified as necessary. Note that the repeat counter (RC) is never modified 
when the repeat-mode flag (RM) is 0. 


The repeat counter should be loaded with a value one less than the number 
of times to execute the block; for example, an RC value of 4 would execute the 
block five times. The detailed algorithm for the update of the PC is shown in 
Example 7-1. 


Notes:

1) The maximum number of repeats occurs when RC = 8000 0000h. This
   results in 8000 0001h repetitions. The minimum number of repeats
   occurs when RC = 0. This results in one repetition.

2) RE should be greater than or equal to RS (RE >= RS). Otherwise, the code
   will not repeat even though the RM bit remains set to 1.

3) By writing a 0 into the repeat counter or writing 0 into the RM bit of the
   status register, you can stop the loop before it completes.
Example 7-1. Repeat-Mode Control Algorithm

if RM == 1                              ; If in repeat mode (RPTB or RPTS)
   if S == 1                            ; If RPTS
      if first time through             ; If this is the first fetch
         fetch instruction from memory  ; Fetch instruction from memory
      else                              ; If not the first fetch
         fetch instruction from IR      ; Fetch instruction from IR
      RC - 1 -> RC                      ; Decrement RC
      if RC < 0                         ; If RC is negative
                                        ; Repeat single mode completed
         0 -> ST(RM)                    ; Turn off repeat mode bit
         0 -> S                         ; Clear S
         PC + 1 -> PC                   ; Increment PC
   else if S == 0                       ; If RPTB
      fetch instruction from memory     ; Fetch instruction from memory
      if PC == RE                       ; If this is the end of the block
         RC - 1 -> RC                   ; Decrement RC
         if RC >= 0                     ; If RC is not negative
            RS -> PC                    ; Set PC to start of block
         else if RC < 0                 ; If RC is negative
            0 -> ST(RM)                 ; Turn off repeat mode bits
            0 -> S                      ; Clear S
            PC + 1 -> PC                ; Increment PC
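The block-repeat branch of Example 7-1 (S = 0) can be traced with a short
software model. The Python sketch below is an illustration, not a description
of the hardware: it assumes straight-line code inside the block, so that any
fetch that is not at RE simply advances the PC by one, and it ignores the
pipeline effects that Section 7.1.6 describes for the final RC value. The
function name run_block_repeat is hypothetical.

```python
def run_block_repeat(rs, re, rc):
    """Return the list of PC values fetched while the block repeat is active."""
    pc, rm, fetched = rs, 1, []
    while rm:
        fetched.append(pc)         # fetch instruction from memory
        if pc == re:               # end of the block reached
            rc -= 1                # RC - 1 -> RC
            if rc >= 0:
                pc = rs            # RS -> PC: start another pass
            else:
                rm = 0             # 0 -> ST(RM): repeat mode finished
                pc += 1            # fall through to the next instruction
        else:
            pc += 1                # assumed: ordinary sequential fetch
    return fetched


# A three-instruction block (addresses 10h-12h) with RC = 4.
fetched = run_block_repeat(0x10, 0x12, 4)
passes = fetched.count(0x10)
```

With RC = 4 the block is entered five times, matching the "RC + 1" rule
stated in the text.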


7.1.3. RPTB and RPTBD Instructions 


The RPTB and RPTBD instructions repeat a block of code a specified number 
of times. RPTBD is a delayed form of the RPTB instruction that allows placing 
three instructions after it. These three instructions are not part of the block that 
is repeated, but they execute before the block repeat is started. This way, the 
pipeline remains full, and the RPTBD instruction can execute in one cycle. 


The number of times to repeat the block is the RC (repeat count) register value 
plus one. Because the execution of RPTB and RPTBD does not load the RC, 
you must load this register yourself. The RC register must be loaded before 
the RPTB/RPTBD instruction is executed. The RC register should not be 
loaded in the 3 instructions after RPTBD. Example 7-2 shows a typical setup 
of the block repeat operation. 


Example 7-2. RPTB Operation

        LDI    15,RC          ; Load repeat counter with 15
        RPTB   ENDLOP         ; Execute the block of code
STLOOP                        ; from STLOOP to ENDLOP 16 times
        .
        .
        .
ENDLOP

All block repeats initiated by RPTB or RPTBD can be interrupted. However, 
interrupts are disabled during the execution of the three instructions following 
an RPTBD. None of the three instructions after the RPTBD instruction should 
modify the PC register or program flow. This restriction also applies to delayed 
branches, as explained in Section 7.2. 


When RPTB src or RPTBD src executes, it performs a sequence of four
operations:

1) Load the start address of the block into RS (repeat start address register).

   - For RPTB, this is the next address following the instruction:
     PC of RPTB + 1 -> RS

   - For RPTBD, this is the fourth address following the instruction:
     PC of RPTBD + 4 -> RS

2) Load the end address of the block into RE (repeat end address register).

   - For RPTB, in PC-relative mode, the 24-bit src operand plus RS is the
     end address:
     src + PC of RPTB + 1 -> RE

   - For RPTBD, in PC-relative mode, the 24-bit src operand plus RS
     is the end address:
     src + PC of RPTBD + 3 -> RE

   - In register mode, the contents of the src register is the end address:
     contents of src register -> RE

3) Set the status register to indicate the repeat mode of operation.
   1 -> RM status register bit (repeat mode flag)

4) Indicate that this is the repeat block mode of operation.
   0 -> S bit (bit is internal to the processor and not programmable)


7.1.4 RPTS Instruction 


An RPTS src instruction repeats the instruction following the RPTS (src + 1)
times. Repeats of a single instruction initiated by RPTS are not interruptible 
since RPTS fetches the instruction word only once and then keeps it in the 
instruction register for reuse. An interrupt in this situation would cause the 
instruction word to be lost. Refetching the instruction word from the instruction 
register reduces memory accesses and, in effect, acts as a one-word program 
cache. If you need a single instruction that is repeatable and interruptible, you 
can use the RPTB/RPTBD instruction. 
When RPTS src is executed, a sequence of five operations occurs:

1) PC + 1 -> RS
2) PC + 1 -> RE
3) 1 -> RM (status register bit)
4) 1 -> S
5) src -> RC (repeat count register)


The RPTS instruction loads all registers and mode bits necessary for the op- 
eration of the single instruction repeat mode. Step 1 loads the start address 
of the block into RS. Step 2 loads the end address into the RE (end address 
of the block). Since this is a repeat of a single instruction, the start address and 
the end address are the same. Step 3 sets the status register to indicate the 
repeat mode of operation. Step 4 indicates that this is the repeat single-instruc- 
tion mode of operation. Step 5 loads src into RC. 


7.1.5 Repeat Mode Restriction Rules 


Because the block repeat modes modify the program counter, other instructions
cannot modify the program counter at the same time. Two rules apply:

Rule 1: The last instruction in the block (or the only instruction in a block
of size one) cannot be a Bcond, DBcond, CALL, CALLcond, TRAPcond, RETIcond,
RETScond, IDLE, RPTB, or RPTS. Example 7-3 shows an incorrectly placed
standard branch.

Rule 2: None of the last four instructions from the bottom of the block (nor
the only instruction in a block of size one) can be a BcondD, BRD, DBcondD,
RPTBD, LAJ, LAJcond, LATcond, BcondAF, BcondAT, or RETIcondD.
Example 7-4 shows an incorrectly placed delayed branch.

If either of these rules is violated, the PC will be undefined.


Example 7-3. Incorrectly Placed Standard Branch

        LDI    15,RC          ; Load repeat counter with 15
        RPTB   ENDLOP         ; Execute block of code
STLOOP                        ; from STLOOP to ENDLOP 16 times
        .
        .
        .
ENDLOP  BR     OOPS           ; This branch violates rule 1


Example 7-4. Incorrectly Placed Delayed Branch

        LDI    15,RC          ; Load repeat counter with 15
        RPTB   ENDLOP         ; Execute block of code
STLOOP                        ; from STLOOP to ENDLOP 16 times
        .
        .
        .
        BRD    OOPS           ; This branch violates rule 2
        ADDF
        MPYF
ENDLOP  SUBF

7.1.6 RC Register Value After Repeat Mode Completes 


For the RPTB/RPTBD instruction, the RC register normally decrements to 
0000 0000h, unless the block size is 1; in that case, it decrements to 
FFFF FFFFh. However, if the RPTB/RPTBD instruction with a block size of 1 
has a pipeline conflict in the instruction being executed, the RC register decre- 
ments to 0000 0000h. Example 7-5 illustrates a pipeline conflict. Refer to 
Chapter 8 for pipeline information. 


RPTS normally decrements the RC register to FFFF FFFFh. However, if the 
RPTS has a pipeline conflict on the last cycle, the RC register decrements to 
0000 0000h. 


In any case, the number of repetitions is always RC + 1, regardless of the final 
value of RC. 


Example 7-5. Pipeline Conflict in a RPTB Instruction

EDC     .word  40000000h      ; Program is located in 4000000Fh
        LDP    EDC
        LDI    @EDC,AR0
        LDI    15,RC          ; Load repeat counter with 15
        RPTB   ENDLOP         ; Execute block of code
ENDLOP  ...                   ; The *AR0 read conflicts with
                              ; the instruction fetching.
                              ; Then RC decrements to 0. If
                              ; cache is enabled, RC decrements
                              ; to FFFF FFFFh
7.1.7 Nesting Block Repeats


Block repeats (RPTB and RPTBD) are nestable. Because all of the control of
a block repeat is defined by the RS, RE, RC, and ST registers, these registers
must be saved and restored to nest block repeats. For example, if you write an
interrupt service routine that requires the use of RPTB or RPTBD, it is possible
that the interrupt associated with the routine may occur during another block
repeat. The interrupt service routine can check the RM bit to determine
whether the block repeat mode is active. If RM is set, the interrupt routine
should save ST, RS, RE, and RC, in this order. The interrupt routine can then
perform a block repeat. Before returning to the interrupted routine, the
interrupt routine should restore RC, RE, RS, and ST, in this order. If the RM
bit is not set, you do not need to save and restore these registers.


The RPTS instruction can also be used in a block repeat loop if the proper reg- 
isters are saved. 


Because the program counter is modified at the end of the loop according to 
the contents of registers RS, RE, and RC, no operation should attempt to 
modify the repeat counter or the program counter to a different value at the end 
of the loop. 


It takes four cycles of overhead to save and restore these registers. Hence, 
sometimes, it may be more economical to implement a nested loop by the 
more traditional method of using a register as a counter and then using a 
delayed branch, rather than by using the nested repeat block approach. Often, 
implementing the outer loop as a counter and the inner loop as a RPTB/ 
RPTBD instruction produces the fastest execution. 


Note:

The order in which the registers are saved/restored is important to guarantee
correct operation. The ST register should be restored last, after the RC, RE,
and RS registers. ST should be restored after restoring RC, because the RM
bit cannot be set to one if the RC register is 0 or -1. For this reason, if you
execute a POP ST instruction (with ST(RM) = 1) while RC = 0, the POP
instruction recovers all of the ST register bits except the RM bit, which stays
at 0 (repeat mode disabled). Also, RS and RE should be correctly set before
you activate the repeat mode.
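The effect of the restore order can be seen in a small software model. The
Python sketch below is an illustration of the rule stated in the note, not
’C4x code: it models only the RM bit of ST, treats RC as a 32-bit value, and
assumes RC holds 0 when the inner loop has finished. The class and function
names are hypothetical.

```python
class RepeatState:
    """Minimal model of the repeat registers and the RM bit of ST."""

    def __init__(self):
        self.rs = self.re = self.rc = 0
        self.rm = 0

    def write_st_rm(self, rm):
        # Rule from the note: RM cannot be set to 1 while RC is 0 or -1.
        if rm == 1 and (self.rc & 0xFFFFFFFF) in (0, 0xFFFFFFFF):
            self.rm = 0
        else:
            self.rm = rm


def restore(cpu, saved, order):
    """Restore the saved registers in the given order."""
    for name in order:
        if name == "ST":
            cpu.write_st_rm(saved["RM"])
        else:
            setattr(cpu, name.lower(), saved[name])
    return cpu


saved = {"RM": 1, "RS": 0x100, "RE": 0x110, "RC": 7}
good = restore(RepeatState(), saved, ["RC", "RE", "RS", "ST"])
bad = restore(RepeatState(), saved, ["ST", "RC", "RE", "RS"])
```

In this model, good.rm ends at 1 and the outer loop resumes, while bad.rm
ends at 0 because ST was written while RC was still 0.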


Section 1.7, Repeat Modes, in the TMS320C4x General-Purpose Applications
User’s Guide contains examples of how to use repeat-mode instructions.




7.2 Delayed Branches 
The ’C4x offers two main types of branches: standard and delayed. 


Standard branches empty the pipeline before performing the branch; this
guarantees correct management of the program counter and results in a ’C4x
branch taking four cycles. Included in this class are standard branches
(Bcond), repeats, calls, returns, and traps.


Delayed branches do not empty the pipeline but guarantee that the next three
instructions will execute before the program counter is modified by the branch.
The result is a branch that requires only a single cycle, thus making the speed
of the delayed branch very close to the speed of the optimal block repeat modes
of the ’C4x. However, unlike block repeat modes, delayed branches can be
used in situations other than looping. Every delayed branch has a standard
branch counterpart that is used when a delayed branch cannot be used.


Conditional delayed branches use the conditions, reflected in the status regis- 
ter, that existed at the end of the instruction preceding the branch. They do not 
depend upon the instructions following the delayed branch. The execution 
time of a conditional delayed branch instruction is the same regardless of 
whether or not the branch is taken. 


When a delayed branch is fetched, it remains pending until the three
instructions that follow are executed. None of the three instructions
immediately after a delayed branch can be any of the following:

Bcond      DBcond     LAJ        RETScond
BcondD     DBcondD    LAJcond    RPTB
BcondAF    CALL       LATcond    RPTBD
BcondAT    CALLcond   RETIcond   RPTS
BR         IDLE       RETIcondD  TRAPcond
BRD


This restriction also applies to the RPTBD instruction, covered in subsection 
7.1.3. 


Delayed branches disable interrupts until the three instructions following the
delayed branch are completed. This is independent of whether or not the
branch is taken.


Incorrectly used delayed branches can leave the PC undefined. 
Example 7-6 illustrates an incorrectly-placed delayed branch. 




Example 7-6. Incorrectly Placed Delayed Branches

B1:     BD     L1
        NOP
        NOP
B2:     B      L2             ; This branch is incorrectly placed
        NOP
        NOP
        NOP


Sometimes, a branch is necessary for the program flow when fewer than three 
instructions can be placed after a delayed branch. For faster execution, it is 
still advantageous to use a delayed branch. This is shown in Example 7-7, 
with a NOP taking the place of the third unused instruction. The tradeoff is 
more instruction words for less execution time. 


Example 7-7. Delayed Branch Execution

*       TITLE DELAYED BRANCH EXECUTION
        LDF    *+AR1(5),R2    ; Load contents of memory to R2
        BGED   SKIP           ; If loaded number >= 0, branch
                              ; (delayed)
        LDF    R2,R1          ; If loaded number < 0, load it to R1
        SUBF   3.0,R1         ; Subtract 3 from R1
        NOP                   ; Dummy operation to complete
                              ; delayed branch
        MPYF   1.5,R1         ; Continue here if loaded number < 0
SKIP    LDF    R1,R3          ; Continue here if loaded number >= 0
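The data flow of Example 7-7 can be written out as ordinary sequential code.
The Python sketch below is a model based on the rules stated in this section
(the condition is taken from the instruction before the branch, and the three
following instructions always execute); it is not ’C4x code, and the function
name example_7_7 is hypothetical.

```python
def example_7_7(mem_value):
    """Return the value left in R3 for a given word at *+AR1(5)."""
    r2 = mem_value                 # LDF *+AR1(5),R2 sets the condition flags
    take_branch = r2 >= 0          # BGED SKIP: condition is fixed at this point
    r1 = r2                        # LDF R2,R1   (delay slot 1, always executes)
    r1 = r1 - 3.0                  # SUBF 3.0,R1 (delay slot 2, always executes)
    # NOP                          # delay slot 3
    if not take_branch:
        r1 = r1 * 1.5              # MPYF 1.5,R1 runs only when branch not taken
    r3 = r1                        # SKIP: LDF R1,R3
    return r3


r3_nonnegative = example_7_7(5.0)
r3_negative = example_7_7(-1.0)
```

Both paths subtract 3 because the SUBF sits in a delay slot; only the
multiply by 1.5 depends on the branch outcome.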


There are two types of delayed branches: branches without annulling and 
branches with annulling. 


7.2.1 Delayed Branches Without Annulling 


Delayed branches without annulling do not empty the pipeline but guarantee
that the next three instructions execute before the program counter is modified
by the branch. The delayed branches without annulling are BcondD, BRD, and
DBcondD.
7.2.2 Delayed Branches With Annulling


Delayed branches with annulling may conditionally annul the next three
instructions. The delayed branches with annulling are BcondAT and BcondAF:


BcondAF 


If the condition is true, the BcondAF instruction executes the three instruc- 
tions following the branch and then branches. If the condition is false, the 
processor does not take the branch and it annuls the effects of the execute 
phase of the first following instruction and the effects of the read and 
execute phases of the second and third following instructions. 


BcondAT 


If the condition is true, the BcondAT instruction causes a branch and an- 
nuls the effects of the execute phase of the first following instruction and 
the effects of the read and execute phases of the second and third follow- 
ing instructions. If the condition is false, the instruction causes the execu- 
tion of the three instructions following the branch and does not cause a 
branch. 
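The three delayed-branch variants differ only in when the following three
instructions take effect. The Python sketch below tabulates the behavior
described above as a pair (branch taken, delay-slot instructions take
effect). It is a summary aid with hypothetical names, and it does not model
which pipeline phases are annulled.

```python
def delayed_branch_outcome(kind, condition):
    """Return (branch_taken, delay_slots_take_effect) for a delayed branch."""
    if kind == "D":                # BcondD: slots always execute
        return condition, True
    if kind == "AF":               # BcondAF: slots annulled if condition false
        return condition, condition
    if kind == "AT":               # BcondAT: slots annulled if condition true
        return condition, not condition
    raise ValueError("unknown delayed-branch kind: " + kind)


outcomes = {
    (kind, cond): delayed_branch_outcome(kind, cond)
    for kind in ("D", "AF", "AT")
    for cond in (True, False)
}
```

BcondAF suits loops whose delay slots belong to the branch target path, and
BcondAT suits code whose delay slots belong to the fall-through path.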
7.3 Calls, Traps, Branches, Jumps, and Returns


Calls and traps can execute a subroutine or function while providing a return
to the calling routine.


The CALL, CALLcond, and TRAPcond instructions store the value of the PC 
on the stack before changing the PC’s contents. The RETScond or RETIcond 
(standard or delayed) instructions use the value on the stack to return execu- 
tion from traps and calls. 


CALL is a four-cycle instruction, while CALLcond and TRAPcond are five- 
cycle instructions. ’C4x delayed instructions LAJ, LAJcond, and LATcond pro- 
vide equivalent functionality, respectively, but in a single cycle. 


- CALL places the next PC value on the stack and places the src (source)
  operand into the PC. The src is a 24-bit PC-relative or register value.
  Figure 7-1 shows CALL response timing.

- CALLcond is similar to the CALL instruction (above) except for two
  differences:

  - It executes only if a specific condition is true (the 20 conditions,
    including unconditional, are listed in Section 14.2 on page 14-12).

  - The src is either a 24-bit PC-relative displacement or in register
    addressing mode.

- TRAPcond executes only if a specific condition is true (same conditions
  as for the CALLcond instruction). When it executes, a four-step sequence
  occurs:

  1) The values of the GIE and CF status register bits are saved into the
     PGIE and PCF status register bits.

  2) Interrupts are disabled (GIE = 0) and the cache is frozen (CF = 1).

  3) The next PC value is stored on the stack.

  4) The specified vector is retrieved from the trap vector table and is
     loaded into the PC. The vector address corresponds to a trap number
     in the instruction.

  Using RETIcond or RETIcondD to return re-enables interrupts if the status
  register’s GIE bit was set previously and recovers the previous CF bit.

- RETScond returns execution from any of the above three instructions by
  popping the top of the stack to the PC. For RETScond to execute, the
  specified condition must be true. The conditions are the same for RETScond
  as for the CALLcond instruction.

- RETIcond returns from traps or calls in the same way that RETScond
  does with the addition that RETIcond also copies the PGIE and PCF bit
  values into the GIE and CF bits of the status register. The conditions for
  RETIcond are the same as for the CALLcond instruction.

- RETIcondD returns from traps or calls in the same way that RETIcond
  does with the addition that RETIcondD first executes the three instructions
  immediately following RETIcondD. The conditions for RETIcondD are the
  same as for the CALLcond instruction.

- Link and jump (LAJ), link and jump conditional (LAJcond), and link and
  trap conditional (LATcond) each provide a return address in
  extended-precision register R11.

  - After it executes the three instructions that follow it, LAJ jumps to an
    address derived by a 24-bit PC-relative addressing mode (see
    subsection 6.6 for more information).

  - The LAJcond destination address is either PC-relative (a displacement)
    or the contents of a specified register. If the condition is true,
    LAJcond first executes the three instructions following the LAJcond
    before making the jump. If the condition is not true, execution
    continues immediately after the LAJcond instruction.

  - After it executes the three instructions that follow it, LATcond calls one
    of the 512 available trap vectors pointed to by the trap vector table
    pointer (see Section 3.2, on page 3-17, for more information about the
    TVTP).
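The TRAPcond sequence and the matching RETIcond return can be sketched as a
software model. The Python code below is an illustration under stated
assumptions, not ’C4x code: the stack is a Python list, the cache is treated
as frozen when CF = 1, and the vector for trap n is assumed to sit at address
TVTP + n. The class and function names are hypothetical.

```python
class Cpu:
    """Minimal model of the state touched by a trap and its return."""

    def __init__(self, tvtp, memory):
        self.pc = 0
        self.gie, self.cf = 1, 0   # interrupts enabled, cache not frozen
        self.pgie, self.pcf = 0, 0
        self.stack = []
        self.tvtp = tvtp
        self.memory = memory       # address -> word


def trap(cpu, n):
    cpu.pgie, cpu.pcf = cpu.gie, cpu.cf      # 1) save GIE and CF
    cpu.gie, cpu.cf = 0, 1                   # 2) disable interrupts, freeze cache
    cpu.stack.append(cpu.pc + 1)             # 3) push the next PC value
    cpu.pc = cpu.memory[cpu.tvtp + n]        # 4) load vector (assumed TVTP + n)


def reti(cpu):
    cpu.pc = cpu.stack.pop()                 # return address from the stack
    cpu.gie, cpu.cf = cpu.pgie, cpu.pcf      # recover the previous GIE and CF


cpu = Cpu(tvtp=0x200, memory={0x205: 0x1000})
cpu.pc = 0x40
trap(cpu, 5)
in_trap = (cpu.pc, cpu.gie, cpu.cf, list(cpu.stack))
reti(cpu)
after_return = (cpu.pc, cpu.gie, cpu.cf)
```

A return through RETScond would restore only the PC in this model, which is
why traps are normally paired with RETIcond or RETIcondD.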


Functionally, calls and traps accomplish the same task: a subfunction is
called and executed, and control is then returned to the calling function.
Traps offer two advantages over calls:

- Interrupts are automatically disabled when a trap is executed. This allows
  critical code to execute without risk of being interrupted. Thus, traps are
  usually terminated with a RETIcond or RETIcondD instruction to re-enable
  interrupts if the status register GIE bit was set previously.

- You can use traps to indirectly call functions. This is particularly
  beneficial when a kernel of code contains the basic subfunctions to be used
  by applications. In this case, you can modify the functions in the kernel
  and relocate them without recompiling each application.


Figure 7-1. CALL Response Timing

[Timing diagram: the CALL instruction passes through the Fetch CALL, Decode
CALL, Read CALL, and Execute CALL (store PC on stack) phases, followed by the
fetch of the first subroutine instruction. Signals shown: ADDR and Data.]
7.4 Interrupts


The ’C4x supports multiple internal and external interrupts, which can be used
for a variety of applications. Internal interrupts are generated by the DMA
controller, the timers, and the communication ports. The five external
interrupt pins include four external maskable interrupt pins (IIOF0-IIOF3)
and one nonmaskable interrupt (NMI) pin. Interrupts can be sent to both the
CPU and the DMA controller.


Interrupts on the ’C4x are automatically prioritized. This allows interrupts that 
occur simultaneously to be serviced in a predefined order. 


This section discusses the operation of these interrupts. Additional information 
regarding internal interrupts can be found in Section 12.6, Coordinating Com- 
munication Ports with the CPU and DMA Processor, on page 12-17, Section 
11.10, DMA and Interrupts, on page 11-42, and Chapter 13, Timers. See Sec- 
tion 7.6, DMA Interrupts, on page 7-26, for more information about interrupts 
to the DMA controller. 


7.4.1 Interrupt Vector Table and Prioritization 


The interrupt vector table (IVT) shown in Figure 7—2 contains the interrupt vec- 
tors. An interrupt vector is an address of an interrupt service routine that should 
start executing when an interrupt is received. The IVT table must be placed on 
a 512-word memory boundary. The table location is determined by the value 
that is stored in the IVTP register (see Section 3.2, CPU Expansion Register 
File, on page 3-17). 


Prioritization means that an interrupt in a higher position in the interrupt
vector table (Figure 7-2) is serviced before one in a lower position when both
are received in the same clock cycle or when two previously received
interrupts are waiting to be serviced. It does not mean, for example, that
IIOF3 must wait until service routines for IIOF2, IIOF1, and IIOF0 are
completed (when ST(GIE) = 1).


The priority of interrupts is handled by the CPU according to the interrupt
vector table. Priority is set according to position in the table: those with
displacements closest to the IVTP base address are higher in priority (i.e.,
NMI is higher than TINT0, which is higher than IIOF0, etc.). Note that
interrupt TINT0 is located at IVTP + 2, while the TINT1 vector is located at
IVTP + 2Bh after the communication port and DMA coprocessor interrupts.
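The priority rule amounts to picking the pending interrupt with the smallest
displacement from IVTP. The Python sketch below illustrates this with a
subset of the displacements from Figure 7-2. It is a model only: it ignores
the enable bits, and the names IVT_OFFSETS and next_interrupt are
hypothetical.

```python
# Displacements from IVTP, taken from Figure 7-2 (subset).
IVT_OFFSETS = {
    "NMI": 0x01,
    "TINT0": 0x02,
    "IIOF0": 0x03,
    "IIOF1": 0x04,
    "IIOF2": 0x05,
    "IIOF3": 0x06,
    "ICFULL0": 0x0D,
    "DMA INT0": 0x25,
    "TINT1": 0x2B,
}


def next_interrupt(pending):
    """Return the pending interrupt closest to the IVTP base, or None."""
    if not pending:
        return None
    return min(pending, key=IVT_OFFSETS.__getitem__)


first = next_interrupt({"IIOF0", "TINT0"})
second = next_interrupt({"TINT1", "DMA INT0", "ICFULL0"})
```

Here first is TINT0 and second is ICFULL0, which shows that the two timer
interrupts sit at opposite ends of the priority order.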
Figure 7-2. Interrupt-Vector Table (IVT)

IVTP +         Vector       Note
000h           Reserved     Note 1
001h           NMI          Note 2
002h           TINT0        Note 3
003h           IIOF0        Note 4
004h           IIOF1        Note 4
005h           IIOF2        Note 4
006h           IIOF3        Note 4
007h-00Ch      Unused
00Dh           ICFULL0      Note 5
00Eh           ICRDY0       Note 5
00Fh           OCRDY0       Note 5
010h           OCEMPTY0     Note 5
011h           ICFULL1      Note 5
012h           ICRDY1       Note 5
013h           OCRDY1       Note 5
014h           OCEMPTY1     Note 5
015h           ICFULL2      Note 5
016h           ICRDY2       Note 5
017h           OCRDY2       Note 5
018h           OCEMPTY2     Note 5
019h           ICFULL3      Note 5
01Ah           ICRDY3       Note 5
01Bh           OCRDY3       Note 5
01Ch           OCEMPTY3     Note 5
01Dh           ICFULL4      Note 5
01Eh           ICRDY4       Note 5
01Fh           OCRDY4       Note 5
020h           OCEMPTY4     Note 5
021h           ICFULL5      Note 5
022h           ICRDY5       Note 5
023h           OCRDY5       Note 5
024h           OCEMPTY5     Note 5
025h           DMA INT0     Note 6
026h           DMA INT1     Note 6
027h           DMA INT2     Note 6
028h           DMA INT3     Note 6
029h           DMA INT4     Note 6
02Ah           DMA INT5     Note 6
02Bh           TINT1        Note 3
02Ch-03Eh      Unused

Notes:

1) Reserved for the reset vector. See Table 7-4.

2) NMI (the nonmaskable interrupt) is discussed in subsection 7.4.5.

3) Timer interrupts TINT0 and TINT1 are enabled by the IIE register
   (subsection 3.1.9, page 3-11) and monitored at the IIF register
   (subsection 3.1.10, page 3-13).

4) External pins IIOF0-IIOF3 are programmed in the IIF register
   (subsection 3.1.10, page 3-13).

5) The communication port I/O buffers full/empty/ready interrupts are enabled
   by the IIE register and are also described in Figure 12-4, on page 12-8,
   (OUTPUT LEVEL and INPUT LEVEL bits).

6) Interrupts from the DMA are enabled at the IIE register and DMA channel
   control register at bits TCC and AUX TCC (see Figure 11-2, on page 11-8,
   for bit descriptions).

7) In the ’C44, the interrupts for communication ports 0 and 3 are active. If
   you enable them with the IE bit, the ISR will be executed.


7.4.2 CPU Interrupt Control Bits


Three CPU registers contain bits used to control CPU interrupt operation:

- The CPU status register (ST). The CPU global interrupt enable bit (GIE),
  located in the ST, controls all maskable CPU interrupts. When this bit is
  set to 1, CPU interrupts are globally enabled. When this bit is cleared to
  0, all CPU interrupts are disabled (except NMI, the nonmaskable
  interrupt). Refer to subsection 3.1.7, Status Register (ST), on page 3-5.

- Internal interrupt enable register (IIE). The IIE is used to enable CPU
  internally-generated interrupts (from timers, communication ports, and
  DMA channels). See subsection 3.1.9, CPU Internal Interrupt Enable
  Register (IIE), on page 3-11, for more information.

- IIOF flag register (IIF). The IIF contains interrupt flag bits and bits to
  determine the function of the external-interrupt pins (IIOF0-IIOF3).


The IIF Register 


When an external interrupt or most of the internal interrupts are received, a 
corresponding bit in the IIF register is set to 1. The only internally generated 
interrupts that do not have a flag bit in the IIF register are the communication 
port interrupts. 


When the CPU services an interrupt that has an interrupt flag bit in the IIF
register, or when the DMA controller latches this type of interrupt into a DMA
internal signal, this flag bit is cleared by the internal interrupt
acknowledge signal. However, for level-triggered interrupts, if IIOFn is still
low when the interrupt acknowledge signal occurs, the interrupt flag bit is
cleared for only one cycle and then set to 1 again. For this reason, it is
theoretically possible that, depending on when the IIF register is read, the
interrupt flag bit may be zero, even though IIOFn is low. After reset, zero is
written to the interrupt flag register, thereby clearing all pending
interrupts.


The IIF register bits can be read or written under software control. This pro- 
vides access to the IIOFx pins, which can be treated as general-purpose I/O 
or as interrupt pins. For example, if at the IIF register, FUNCx = 0 (I/O pin) and 
TYPEx = 1 (output pin), then by writing into the FLAGx bit, you can also write 
to the external pin IIOFx. If FUNCx = 1 (interrupt pin), writing a 1 to the IIF
register FLAGx bit has the same effect as an incoming interrupt received on the
corresponding pin. In this way, all interrupts can be triggered and/or cleared
through software. Since the interrupt bits also can be read, the interrupt pins 
can be polled in software when an interrupt-driven interface is not required. 

Internal interrupts operate in a similar manner. In the IIF register, the bit
corresponding to an internal interrupt (e.g., TINT0, TINT1) can be read and
written through software. Writing a 1 sets the interrupt latch, and writing a 0
clears it. All internal interrupts are one H1/H3 cycle in length. If any previous
bit values of the IIF register need to be preserved, modify the IIF with logic
operations (AND, OR, etc.) applied directly to the IIF register.
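
One plausible reason for operating directly on the IIF is that a flag set by hardware between a separate load and store would be overwritten. The following Python sketch models that hazard; it is not 'C4x code, and the `Iif` class, its methods, and the bit positions are hypothetical names chosen only for illustration.

```python
class Iif:
    """Toy model of an interrupt flag register (not the real IIF layout)."""

    def __init__(self, value=0):
        self.value = value

    def and_in_place(self, mask):
        # Models a single instruction such as "AND R0, IIF".
        self.value &= mask

    def hardware_sets(self, bit):
        # Models an interrupt arriving and setting its flag bit.
        self.value |= 1 << bit


def clear_in_place(iif, mask):
    """Apply the mask directly to the register in one step."""
    iif.and_in_place(mask)
    return iif.value


def clear_via_copy(iif, mask, interrupt_bit=None):
    """Load, modify a copy, and store it back (three separate steps)."""
    copy = iif.value                      # load the register into a copy
    if interrupt_bit is not None:
        iif.hardware_sets(interrupt_bit)  # interrupt arrives after the load
    copy &= mask                          # modify the stale copy
    iif.value = copy                      # store overwrites the new flag
    return iif.value
```

In this model, a flag that arrives after the load in `clear_via_copy` is lost when the stale copy is stored back, whereas `clear_in_place` preserves any flag that was set before the single masking step.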


Figure 7-3. IIF Register Modification

    correct                 incorrect

    LDI  @MASK, R0          LDI  IIF, R1
    AND  R0, IIF            AND  @MASK, R1
                            LDI  R1, IIF


7.4.3 Interrupt Processing


For an interrupt to occur, at least two conditions must be met: 


❏ All interrupts must be enabled globally by setting the GIE bit to 1 in the CPU
status register (ST).


❏ The interrupt must be enabled by setting the corresponding bit in the IIE
register.


The CPU interrupt processing cycle (shown in Figure 7-4) involves several
events. The corresponding interrupt flag in the IIF register is cleared, the
values of the GIE and CF status register bits are preserved, the cache is frozen
(CF = 1), interrupts are globally disabled (GIE = 0), and the CPU completes
all fetched instructions. Then, the interrupt vector is fetched and loaded into
the PC, and the CPU continues execution of the first instruction in the interrupt
service routine (ISR). When you use RETIcond or RETIcondD to return from
the interrupt service routine, the previous GIE and CF bit values are recovered.


If you wish to make the interrupt service routine interruptible, you can set the 
GIE bit to 1 after entering the ISR. In addition, you can enable the cache. Be 
aware that because the PGIE and PCF status register bits are one deep, they 
preserve only the previous GIE and CF bits. 
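
The one-deep save of GIE and CF can be sketched with a small Python model. This is an illustration under stated assumptions, not device code: the `Status` class and the helper names are hypothetical, and only the four bits discussed above are modeled.

```python
from dataclasses import dataclass, replace


@dataclass
class Status:
    """Only the four ST bits discussed in the text are modeled."""
    gie: int = 0
    cf: int = 0
    pgie: int = 0
    pcf: int = 0


def enter_interrupt(st):
    # Save GIE and CF one level deep, then disable interrupts and freeze the cache.
    st.pgie, st.pcf = st.gie, st.cf
    st.gie, st.cf = 0, 1


def return_from_interrupt(st):
    # Models the flag restore performed by RETIcond / RETIcondD.
    st.gie, st.cf = st.pgie, st.pcf


def run_nested(save_st):
    """Run a first-level ISR that is itself interrupted once."""
    st = Status(gie=1, cf=0)
    enter_interrupt(st)                          # first-level ISR starts
    saved = replace(st) if save_st else None     # optionally push a copy of ST
    st.gie = 1                                   # make the ISR interruptible
    enter_interrupt(st)                          # second-level interrupt
    return_from_interrupt(st)                    # second-level return
    if saved is not None:
        st = saved                               # pop ST before returning
    return_from_interrupt(st)                    # first-level return
    return st
```

In this model, `run_nested(False)` returns with CF = 1 although the interrupted program had CF = 0, because the second interrupt overwrote PCF. Saving ST before re-enabling interrupts, as in `run_nested(True)`, restores the original values.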


Note:

The GIE and CF bits are preserved and loaded with new values after the
completion of the last instruction that was fetched before the interrupt was
flushed. This guarantees later restoration of correct flag values.


Figure 7-4. CPU Interrupt Processing

    Is an enabled interrupt set? (If enabled in the IIE or IIF
    registers, the interrupt is a CPU interrupt.)
      → Disable interrupts temporarily
      → Clear interrupt flag (CPU)
      → Complete all fetched instructions
      → PC ← Interrupt vector
      → CPU starts executing ISR routine
      → Return executed (RETI/RETIcond)
      → PGIE → GIE, PCF → CF

CPU interrupts (including NMI) are only acknowledged (responded to by the
CPU) on instruction fetch boundaries. If instruction fetches are halted because 
of pipeline conflicts or when an RPTS loop is executing, CPU interrupts are not 
acknowledged until the next instruction fetch. 


The interrupt acknowledge (IACK) instruction can be used to signal externally
that an interrupt has been serviced. If external memory is specified in the
operand, IACK drives the IACK pin and performs a dummy read. The read is
performed from the address specified by the IACK instruction operand. IACK is
typically placed in the early portion of an interrupt service routine. However,
depending on your application, it may be better suited at the end of the interrupt
service routine or at another location. You are not required to use the IACK
instruction in interrupt service routines.


Note the following situations: 


❏ Interrupts are disabled during an RPTS and during a delayed branch (until
the 3 instructions following a delayed branch are completed). Interrupts
are held until after the branch.


❏ When an interrupt occurs, instructions currently in the decode and read
phases continue regular execution. This is not the case for an instruction
in the fetch phase:


   ■ If the interrupt occurs in the first cycle of the fetch of an instruction, the
     fetched instruction is discarded (not executed), and the address of
     that instruction is pushed to the top of the system stack.


   ■ If the interrupt occurs after the first cycle of the fetch (in the case of a
     multicycle fetch due to wait states), that instruction is executed, and
     the address of the next instruction to be fetched is pushed to the top of
     the system stack.


   ■ If no program fetch is occurring, then no new fetch is performed.


7.4.4 CPU Interrupt Latency

CPU interrupt latency, defined as the time from the acknowledgement of the
interrupt to the execution of the first instruction of the interrupt service routine
(ISR), is at least 8 cycles. This is explained in Table 7-2, where the interrupt
is treated as an instruction, assuming that all the instructions are single-cycle
instructions.


Table 7-2. Interrupt Latency

Cycle  Description                               Fetch     Decode     Read       Execute

1      Recognize interrupt in single-cycle       prog a+1  prog a     prog a-1   prog a-2
       fetched (prog a+1) instruction.

2      Temporarily disable interrupt until GIE   -         interrupt  prog a     prog a-1
       is cleared. Clear the corresponding IIF
       flag (if applicable).

3      Read the interrupt vector table.          -         -          interrupt  prog a

4      Store return address to stack; save the   -         -          -          interrupt
       GIE bit into PGIE and CF into PCF. Then,
       clear the GIE bit and set the CF bit to 1.

5      Pipeline begins to fill with ISR          isr1      -          -          -
       instruction.

6                                                isr2      isr1       -          -

7                                                isr3      isr2       isr1       -

8      Execute first instruction of interrupt    isr4      isr3       isr2       isr1
       service routine.


7.4.5 External Interrupts

The five external interrupt pins include four external maskable interrupt pins
(IIOF0-IIOF3) and one nonmaskable interrupt (NMI) pin.


The four external maskable interrupts (IIOF0-IIOF3) are enabled at the IIF
register (subsection 3.1.10 page 3-13) and are synchronized internally. They 
are sampled on the falling edge of H1 and passed through a series of H1/H3 
delays internally. Once synchronized, the interrupt input will set the corre- 
sponding interrupt flag register (IIF) bit if the interrupt is active. The list below 
shows the external interrupts and their corresponding interrupt vectors: 


IIOF Pin and Interrupt    Interrupt Vector Location
IIOF0                     IVTP + 003h
IIOF1                     IVTP + 004h
IIOF2                     IVTP + 005h
IIOF3                     IVTP + 006h

NMI

If more than one of these interrupts arrives on the same clock cycle, they are
prioritized with IIOF0 the highest, IIOF1 next, and so on. When an interrupt is
taken, the status register ST(GIE) bit is reset to 0, disabling any other
incoming interrupt (except NMI). This prevents any other interrupt
(IIOF0-IIOF3) from assuming program control until the ST(GIE) bit is set back
to 1. In addition, the ST(GIE) bit is saved into ST(PGIE) and the ST(CF) bit into
ST(PCF). On a return from an interrupt routine, the RETI and RETIcond
instructions place the value that is in the ST(PGIE) bit into the ST(GIE) bit and
the ST(PCF) bit into the ST(CF) bit, returning them to their previous values.
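
The same-cycle priority rule can be written as a short Python sketch. It is a simplified model under stated assumptions: the function name `select_interrupt` is hypothetical, `pending` is assumed to hold only the IIOF indices that are already enabled, and NMI is not modeled.

```python
def select_interrupt(pending, gie):
    """Pick which external interrupt is taken in a given cycle.

    pending: set of IIOF indices (0-3) whose flags are set and enabled.
    gie:     current value of the ST(GIE) bit.
    Returns the chosen index, or None if nothing is taken.
    """
    if not gie or not pending:
        return None          # interrupts globally disabled, or nothing pending
    return min(pending)      # IIOF0 has the highest priority, then IIOF1, ...
```

For example, if IIOF1, IIOF2, and IIOF3 are flagged together with GIE = 1, this model selects IIOF1 and leaves the other two pending until GIE is set again.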


External interrupts can be either edge- or level-triggered, depending on how 
the TYPE fields are set in the IIF register (see subsection 3.1.10, IIOF Flag 
Register (IIF), on page 3-13, for more information about the IIF). 


For an edge-triggered interrupt to be detected by the ’C4x, the external pin
must transition from 1 to 0 and then be held low for at least one
H1/H3 cycle (it can be held low longer).


For a level-triggered interrupt to be detected by the ’C4x, the external pin
needs to be held low for between one and two cycles (1 ≤ low-pulse width ≤
2). If the interrupt is held low for more than two cycles, more than one interrupt
might be recognized. There is no need to provide an edge in this case.


Note:

Level-triggered interrupts are unlatched. The ’C4x will only detect them if the
low level is present during a fetch-to-decode pipeline transition. This means
that during a pipeline halt, level-triggered interrupts might be missed
even if they are held low between one and two cycles. This is not the case
for edge-triggered interrupts because they are latched (they are recognized
regardless of whether the pipeline is halted).


The nonmaskable interrupt, NMI (an incoming low on pin AJ5, signal NMI), is 
not masked by the ST(GIE) bit. Even though the NMI is nonmaskable, its pro- 
cessing is temporarily postponed during delayed branches and multicycle 
CPU operations. NMI is a negative-going, edge-triggered, latched interrupt. 


Take special care when using an NMI as a second level interrupt. When the 
’C4x services an interrupt, interrupts are disabled except for the NMI. This 
creates a problem because the ST register may end up with the wrong value 
if the NMI is executed before the first level ISR that preserves the ST register’s 
value. 


The TMS320C44 and the TMS320C40 (revision 5.0 and greater) have a
software-configurable feature that allows the internal peripheral bus to be
forced ready when the NMI signal is asserted. This NMI bus-grant feature is
enabled when bits 18 and 19 in the status register (ST) are set to 10₂. When
enabled, a peripheral bus-grant signal is generated on the falling edge of NMI.
If NMI is asserted and this feature is not enabled, the CPU stalls on an access
to the peripheral bus if the bus is not ready. A stall condition occurs when
writing to a full output FIFO or reading an empty input FIFO. This feature is
useful in correcting communication-port errors when used in conjunction with
the communication-port software-reset feature.

7.5 Traps

A trap is the equivalent of a software-triggered interrupt. In the ’C4x, traps and
interrupts are treated identically, except in the way in which they are initialized.

7.5.1 Initialization of Traps and Interrupts

Traps and interrupts are initialized differently in the ’C4x.


Traps are always triggered by a software mechanism, by the TRAPcond 
(conditional trap) and LATcond (link and trap conditionally delayed) instruc- 
tions. 


Interrupts are always triggered by hardware events (for example, by external 
interrupts, DMA interrupts, or communication channel interrupts). 


The GIE bit in the ST register and the mask bits in the IIE do not apply to
traps.


7.5.2 Operation of Traps

Figure 7-5 shows the general flow of traps (and also of interrupts).

Figure 7-5. Flow of Traps

    (1) Trap executed (TRAPcond or LATcond)
          GIE → PGIE

    (2) Trap or interrupt service routine

    (3) Return executed (RETIcond or RETIcondD)
          PGIE → GIE
          PCF → CF

The RETIcond and RETIcondD instructions manipulate the status flags as
shown in block (3) in Figure 7-5. RETIcond/RETIcondD provides a
return/delayed return from a trap or interrupt.

In general, you should not directly modify the PGIE or PCF status register bits
except when putting the status register on a stack for recursive interrupts or 
traps. 


The ’C4x supports 512 different traps. When a TRAPcond n or LATcond n
instruction is executed, the ’C4x jumps to the address stored in the memory
location pointed to by TVTP + n, where TVTP is the trap vector table pointer
register. The 32-bit TVTP register is essentially the base address for the
trap-vector table (TVT) in memory. This table, shown in Figure 7-6, contains the
addresses of the trap service routines that are executed by the CPU.


Figure 7-6. Trap Vector Table (TVT)

    TVTP + 000h    TRAP0
    TVTP + 001h    TRAP1
        to
    TVTP + 1FEh    TRAP510
    TVTP + 1FFh    TRAP511


As with the interrupt vector table (IVT), the trap vector table (TVT) must begin 
on a 512-word memory boundary. The TVT pointer register (TVTP) points to 
the beginning of the TVT. See Section 3.2, CPU Expansion Register File, on 
page 3-17, for more information about the TVTP. 
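
The vector lookup described above (address TVTP + n, 512 traps, table on a 512-word boundary) can be sketched in Python as follows. The function names and the dictionary used as a stand-in for memory are hypothetical; only the arithmetic comes from the text.

```python
TVT_ALIGNMENT = 0x200   # the table must begin on a 512-word boundary
NUM_TRAPS = 512         # trap numbers 0 through 511


def trap_vector_location(tvtp, n):
    """Return the memory location that holds the vector for trap n."""
    if tvtp % TVT_ALIGNMENT != 0:
        raise ValueError("TVTP must point to a 512-word boundary")
    if not 0 <= n < NUM_TRAPS:
        raise ValueError("trap number must be in the range 0-511")
    return tvtp + n


def trap_target(memory, tvtp, n):
    """Return the service routine address stored at TVTP + n.

    memory is any mapping from word address to stored value.
    """
    return memory[trap_vector_location(tvtp, n)]
```

With an illustrative base address of 2FF800h, trap 1FFh would fetch its vector from 2FF9FFh, the last entry of the table.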


The TRAPcond or LATcond instructions can be used to generate a trap and
manipulate the status flags as shown in block (1) in Figure 7-5. LATcond (link and
trap conditionally) provides a single-cycle trap that is very useful for error
detection and correction.


Note:

Because LATcond is a delayed instruction, the three instructions following
LATcond should not modify the GIE or CF status register bits (this could
result in storing incorrect values of these two bits).


7.5.3 Overlapping the Trap and Interrupt Vector Tables 


The interrupt and trap vector tables can share the same 512-word space in
memory. In this configuration, you can place trap vectors where there are no
interrupt vectors. For example, since interrupt vector 02Ch is unused, you
could place a trap vector at IVTP + 02Ch (which is also TVTP + 02Ch if the tables
overlap) and then call that trap by specifying 02Ch in the TRAP instruction.

7.6 DMA Interrupts


Interrupts can trigger DMA read and write operations. This is called DMA syn- 
chronization. The DMA interrupt processing cycle is similar to that of the CPU. 
After the pertinent interrupt flag is cleared, the DMA coprocessor proceeds ac- 
cording to the status of the SYNC bits in the DMA coprocessor global control 
register. 


If the interrupt in the DMA Interrupt Enable (DIE) register is enabled, the inter- 
rupt controller automatically latches the interrupt and saves it for future DMA 
use. In the case of the flag interrupts (timer, external interrupt), the IIF flags are 
cleared when the interrupt controller latches the interrupt, not when the DMA 
responds to it. Even if the DMA has not been started, the interrupt latch occurs,
except when the START (AUX START) bits in the DMA control register have the
reset value 00₂. DMA reset clears the interrupt internal latch.


7.6.1 DMA Interrupt Control Bits

Two registers contain bits used to control DMA interrupt operation:


❏ DMA interrupt enable register (DIE). All DMA interrupts are controlled by
bits in the DIE and by the SYNC bits of the DMA channel control registers
(described in Figure 11-2). The DMA interrupts are not dependent upon
ST(GIE) and are local to the DMA.


❏ The DMA channel control register. Each DMA coprocessor channel uses
a channel control register to determine its mode of operation. This register
is shown in Figure 11-2.


The DIE is broken into six subfields that determine which interrupts can be
used to control the synchronization for each of the six DMA channels. For
example, the bits in each of these fields allow you to select whether a DMA
channel is synchronized to a communication port, a timer, or an external
interrupt pin.


See subsection 3.1.8, DMA Coprocessor Interrupt Enable Register (DIE), on 
page 3-8, for a description of the DIE. 


7.6.2 DMA Interrupt Processing


Figure 7-7 shows the general flow of interrupt processing by the DMA
coprocessor.


Figure 7-7. DMA Interrupt Processing

    Is an enabled interrupt set? (If enabled in the DIE
    register, the interrupt is a DMA interrupt.)
      → Clear interrupt flag
      → DMA proceeds according to DMA control register SYNC bits
      → DMA continues

For more information about DMA interrupts, see Section 11.10, DMA and Inter- 
rupts. 

7.6.3 CPU/DMA Interrupt Interaction


The ’C4x DMA coprocessor is not affected by the processing of CPU inter- 
rupts, even when the DMA is using interrupts for synchronization of transfers. 
In addition, the DMA is not affected, even when pipeline fetches are halted.


The ’C4x allows the CPU and DMA controller to respond to and process inter- 
rupts in parallel. Figure 7-8 shows the sequence of events in interrupt proces- 
sing for both the CPU and the DMA controller; for the exact sequence of 
events, see Table 7-2. 


It is therefore possible to interrupt the CPU and DMA coprocessors simulta- 
neously with the same or different interrupts and, in effect, synchronize their 
activities. However, because the DMA coprocessor and CPU share the same 
set of interrupt flags, in some instances the DMA coprocessor can clear an in- 
terrupt flag before the CPU can respond to it. For example, if CPU interrupts 
are disabled or if instruction fetches have been halted, the DMA can latch the 
interrupt and thus clear the associated interrupt flag. If the interrupt is enabled 
in the DIE register, the CPU will never be able to “steal” a DMA interrupt, be- 
cause the DMA responds to an interrupt as fast as or faster than the CPU. 
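
The shared-flag behavior can be summarized with a one-step Python model. It is a deliberate simplification under stated assumptions: the function name `resolve_flag` and its boolean inputs are hypothetical, the CPU and DMA are treated as responding in the same step when both are able to, and timing within the step is ignored.

```python
def resolve_flag(flag_set, cpu_can_respond, enabled_in_die):
    """Model who consumes a shared interrupt flag in one step.

    flag_set:        True if the interrupt flag is currently set.
    cpu_can_respond: True if GIE = 1, the interrupt is enabled for the CPU,
                     and instruction fetches are not halted.
    enabled_in_die:  True if the interrupt is enabled in the DIE register.
    Returns (cpu_serviced, dma_latched, flag_still_set).
    """
    if not flag_set:
        return (False, False, False)
    dma_latched = enabled_in_die      # the DMA latches whenever it is enabled
    cpu_serviced = cpu_can_respond    # the CPU responds only if it is able to
    flag_still_set = not (dma_latched or cpu_serviced)
    return (cpu_serviced, dma_latched, flag_still_set)
```

The case `resolve_flag(True, False, True)` is the one the text warns about: the DMA latches the interrupt and clears the flag while the CPU is unable to respond, so the CPU never sees it.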


Figure 7-8. Parallel CPU and DMA Interrupt Processing

    Interrupt

    CPU:              Does GIE = 1 and is the interrupt enabled in the
                      IIE or IIF register?
                        → Process the CPU interrupt as shown in Figure 7-4.

    DMA coprocessor:  Is the interrupt enabled in the DIE register?
                        → Process the DMA interrupt as shown in Figure 7-7.


7.7 Reset 




The ’C4x supports a nonmaskable external reset signal (RESET), which is 
used to perform system reset. This section discusses the reset operation. 


After powerup, the state of the ’C4x processor is undefined. You can use the 
RESET signal to put the processor in a known state. This signal must be as- 
serted low for 10 or more H1 clock cycles to guarantee a system reset (See 
Chapter 1, Processor Initialization, in the TMS320C4x General-Purpose Ap- 
plications User’s Guide for the recommended reset circuit). H1 is an output 
clock signal generated by the ’C4x. 


Reset affects several aspects of ’C4x operation: 


❏ Some device pins
❏ Some device registers
❏ Program execution


7.7.1 Reset’s Effects on Pin States


Reset affects the other pins on the device in either a synchronous or an 
asynchronous manner. The synchronous reset is gated by the ’C4x’s internal 
clocks. The asynchronous reset directly affects the pins and is faster than the 
synchronous reset. Reset timing details are included in the ’C4x data sheets. 


Table 7-3 shows the state of the ’C4x’s pins during RESET = 0 and after
RESET goes back to 1. Each pin is described according to whether the pin is reset
synchronously or asynchronously.


Table 7-3. Pin States At System Reset


(a) Clock (4 pins)

Signal      Pins  I/O§   Type†  Description
H1          1     O      -      Begins clocking when RESET makes a 1-to-0 transition
H3          1     O      -      Begins clocking when RESET makes a 1-to-0 transition
X1          1     O      -      No effect
X2/CLKIN    1     I      -      No effect


† A = Asynchronous, S = Synchronous

‡ Recommended decoupling capacitors are multiple 0.1 µF and 4.7 µF around the device.
  Number depends on specific board noise conditions.

§ I = Input, O = Output, Z = High-impedance state.


(b) Communication Port 0 Interface (12 pins)

Signal      Pins  I/O§   Type†  Description
C0D(7-0)    8     I/O    S      Set to undefined value
CACK0       1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high
CRDY0       1     I/O    A      Set to high-impedance
CREQ0       1     I/O    A      Set to high-impedance
CSTRB0      1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high


(c) Communication Port 1 Interface (12 pins)

Signal      Pins  I/O§   Type†  Description
C1D(7-0)    8     I/O    S      Set to undefined value
CACK1       1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high
CRDY1       1     I/O    A      Set to high-impedance
CREQ1       1     I/O    A      Set to high-impedance
CSTRB1      1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high

(d) Communication Port 2 Interface (12 pins)

Signal      Pins  I/O§   Type†  Description
C2D(7-0)    8     I/O    S      Set to undefined value
CACK2       1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high
CRDY2       1     I/O    A      Set to high-impedance
CREQ2       1     I/O    A      Set to high-impedance
CSTRB2      1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high


(e) Communication Port 3 Interface (12 pins)

Signal      Pins  I/O§   Type†  Description
C3D(7-0)    8     I/O    S      Set to high-impedance
CACK3       1     I/O    A      Set to high-impedance
CRDY3       1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high
CREQ3       1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high
CSTRB3      1     I/O    A      Set to high-impedance


(f) Communication Port 4 Interface (12 pins)

Signal      Pins  I/O§   Type†  Description
C4D(7-0)    8     I/O    S      Set to high-impedance
CACK4       1     I/O    A      Set to high-impedance
CRDY4       1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high
CREQ4       1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high
CSTRB4      1     I/O    A      Set to high-impedance


(g) Communication Port 5 Interface (12 pins)

Signal      Pins  I/O§   Type†  Description
C5D(7-0)    8     I/O    S      Set to high-impedance
CACK5       1     I/O    A      Set to high-impedance
CRDY5       1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high
CREQ5       1     I/O    A      Set high-impedance when reset goes low
                                and then set to one when reset goes high
CSTRB5      1     I/O    A      Set to high-impedance


(h) Emulation (7 pins)

Signal      Pins  I/O§   Type†  Description
EMU0        1     I/O    -      Undefined
EMU1        1     I/O    -      Undefined
TCK         1     I      -      No effect
TDI         1     I      -      No effect
TDO         1     O      -      No effect
TMS         1     I      -      No effect
TRST        1     I      -      No effect


(i) Global Bus External Interface (80 pins)

Signal      Pins  I/O§   Type†  Description
A(30-0)     31    O/Z    S      Set to high-impedance
AE          1     I      -      No effect
CE0         1     I      -      No effect
CE1         1     I      -      No effect
D(31-0)     32    I/O/Z  S      Set to high-impedance
DE          1     I      -      No effect
LOCK        1     O      S      Set to one
PAGE0       1     O/Z    S      Set to zero
PAGE1       1     O/Z    S      Set to zero
RDY0        1     I      -      No effect
RDY1        1     I      -      No effect
R/W0        1     O/Z    S      Set to one
R/W1        1     O/Z    S      Set to one
STAT(3-0)   4     O      S      Set to all ones
STRB0       1     O/Z    S      Set to one
STRB1       1     O/Z    S      Set to one


(j) Local Bus External Interface (80 pins)

Signal      Pins  I/O§   Type†  Description
LA(30-0)    31    O/Z    S      Placed in high-impedance state
LAE         1     I      -      Reset has no effect
LCE0        1     I      -      Reset has no effect
LCE1        1     I      -      Reset has no effect
LD(31-0)    32    I/O/Z  S      Set to high-impedance
LDE         1     I      -      Reset has no effect
LLOCK       1     O      S      Set to one
LPAGE0      1     O/Z    S      Set to zero
LPAGE1      1     O/Z    S      Set to zero
LRDY0       1     I      -      Reset has no effect
LRDY1       1     I      -      Reset has no effect
LR/W0       1     O/Z    S      Set to one
LR/W1       1     O/Z    S      Set to one
LSTAT(3-0)  4     O      S      Set to all ones
LSTRB0      1     O/Z    S      Set to one
LSTRB1      1     O/Z    S      Set to one


(k) Interrupts, I/O Flags, Reset, Timer (12 pins)

Signal          Pins  I/O§   Type†  Description
IACK            1     O      S      Set to one
IIOF(0-3)       4     I/O    A      Set to high-impedance
NMI             1     I      -      No effect
RESET           1     I      -      RESET input pin
RESETLOC(1,0)   2     I      -      No effect
ROMEN           1     I      -      No effect
TCLK0           1     I/O    A      Set to high-impedance
TCLK1           1     I/O    A      Set to high-impedance


(l) Power (70 pins)

Signal      Pins  I/O§   Type†  Description
SUBS        1     I      -      Substrate pin (tie to ground). Set to high-impedance.
VSSL        4     I      -      Ground pins. Set to high-impedance.
CVSS        15    I      -      Ground pins. Set to high-impedance.
DVSS        15    I      -      Ground pins. Set to high-impedance.
IVSS        6     I      -      Ground pins. Set to high-impedance.
DVDD        13    I      -      +5 VDC supply pins. Set to high-impedance.‡
GADVDD      3     I      -      +5 VDC supply pins. Set to high-impedance.‡
GDDVDD      3     I      -      +5 VDC supply pins. Set to high-impedance.‡
LADVDD      3     I      -      +5 VDC supply pins. Set to high-impedance.‡
LDDVDD      3     I      -      +5 VDC supply pins. Set to high-impedance.‡
VDDL        4     I      -      +5 VDC supply pins. Set to high-impedance.‡


7.7.2 Reset Vector Location


When RESET is released, the ’C4x begins executing the application program.
The initial address of the program is stored in the reset vector. The ’C4x
permits selection of any one of four reset vector locations. The reset vector
location that is used is determined by the levels on the RESETLOC1
and RESETLOC0 pins at reset. Table 7-4 shows the possible configurations
of these pins.


Table 7-4. RESET Vector Locations

Value at RESETLOCx Pin          Get Reset Vector From
RESETLOC1    RESETLOC0          Hex Memory Address    Comment

0            0                  00000 0000            Local Bus
0            1                  07FFF FFFF†           Local Bus
1            0                  08000 0000†           Global Bus
1            1                  0FFFF FFFF†           Global Bus


† This corresponds to the 32-bit address that the processor accesses. However, in the ’C44 only
  the 24 LSBs of the reset address will be driven on pins A0-A23 or pins LA0-LA23. The
  corresponding LSTRBx pins will also be activated.
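
As a quick illustration, Table 7-4 can be expressed as a lookup in Python. This is a sketch under stated assumptions: the names `RESET_VECTORS`, `reset_vector_location`, and `c44_pin_address` are hypothetical, and the leading 0 in the table's hex values is read as a notation prefix rather than an extra address digit.

```python
# (RESETLOC1, RESETLOC0) -> (32-bit reset vector address, bus)
RESET_VECTORS = {
    (0, 0): (0x0000_0000, "local"),
    (0, 1): (0x7FFF_FFFF, "local"),
    (1, 0): (0x8000_0000, "global"),
    (1, 1): (0xFFFF_FFFF, "global"),
}


def reset_vector_location(resetloc1, resetloc0):
    """Return (address, bus) for the given RESETLOC pin levels."""
    return RESET_VECTORS[(resetloc1, resetloc0)]


def c44_pin_address(address):
    """Return the 24 LSBs of an address, as driven on the 'C44 address pins."""
    return address & 0x00FF_FFFF
```

For example, with RESETLOC1 = 0 and RESETLOC0 = 1 the processor accesses 7FFF FFFFh, and on a ’C44 only the low 24 bits (FF FFFFh) would appear on the address pins.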


7.7.3 Additional Reset Operations


After system reset (after RESET goes back from 0 to 1), the following
additional operations are performed:


❏ Timer registers are set.

   ■ The timer global control register is set to 0, except that bit DATIN is set
     to the value on pin TCLK.

   ■ The timer counter and timer period registers are set to zeros.


❏ Control registers for communication ports 0-2 (subsection 12.3.1 on page
12-8) are set to zeros (output operation), and control registers for
communication ports 3-5 are set to 04h (input operation).


❏ External memory interface control registers (Section 9.3 on page 9-6) are
set to 3E39 FFF0h (7 wait states).


❏ DMA channel control register, DMA transfer counter, and DMA auxiliary
transfer counter (subsection 11.3.1 on page 11-7) are set to zeros.


❏ The following CPU registers are loaded with zeros (each described in
Chapter 3):

   ■ IIE (CPU internal interrupt enable register)
   ■ IIF (interrupt flag register)
   ■ DIE (DMA interrupt enable register)
   ■ IVTP (interrupt-vector table pointer)
   ■ TVTP (trap-vector table pointer)


❏ The CPU status register (ST) is set to 0400h, which puts the on-chip cache
in cache freeze mode.


❏ The reset vector is read from its location and loaded into the PC.


❏ If ROMEN = 1 (internal ROM enabled), the RESETLOC(1,0) pins are low,
and the IIOF0 pin is high, the ’C4x will start execution of the bootloader
code. Otherwise, the ’C4x will start execution of the routine which is
pointed to by the reset vector corresponding to the RESETLOC(1,0) pins.
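
The last condition in the list above can be written as a small predicate. This Python sketch only restates the stated pin condition; the function name `starts_in_bootloader` is hypothetical, and pin levels are represented as 0 (low) or 1 (high).

```python
def starts_in_bootloader(romen, resetloc1, resetloc0, iiof0):
    """Return True if, per the stated condition, the bootloader runs after reset.

    All arguments are pin levels: 0 for low, 1 for high.
    """
    return (
        romen == 1          # internal ROM enabled
        and resetloc1 == 0  # both RESETLOC pins low
        and resetloc0 == 0
        and iiof0 == 1      # IIOF0 pin high
    )
```

In any other combination, execution would start at the routine pointed to by the reset vector selected by the RESETLOC(1,0) pins.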


Multiple ’C4xs driven by the same system clock can be reset and
synchronized. See Reset Signal Generation in the TMS320C4x General-Purpose
Applications User’s Guide for information about resetting multiple ’C4xs.


Pipeline Operation


Two characteristics of the ’C4x that contribute to its high performance are
pipelining and concurrent I/O and CPU operation.


Four functional units control ’C4x pipeline operation: fetch, decode, read, and
execute. Pipelining is the overlapping, or parallel operation, of the fetch,
decode, read, and execute levels of a basic instruction.


The DMA coprocessor decreases pipeline interference and enhances the 
CPU’s computational throughput by performing input/output operations. 

8.1 Pipeline Structure

The four major units of the ’C4x pipeline structure and their functions are as
follows:


Fetch Unit (F) Fetches the instruction words from memory 
and updates the program counter (PC). 


Decode Unit (D) Decodes the instruction word and performs ad- 
dress generation. Also, controls modification of 
the ARn registers in the indirect addressing 
mode, and of the stack pointer when PUSH to/ 
POP from the stack occurs. 


Read Unit (R) If required, reads the operands from memory. 


Execute Unit (E) If required, reads the operands from the regis- 
ter file, performs the necessary operation, and 
writes results to the register file. If required, re- 
sults of previous operations are written to 
memory. 


A basic instruction has four levels: fetch, decode, read, and execute. 
Figure 8-1 illustrates these four levels of the pipeline structure. The levels are 
indexed according to instruction and execution cycle. In the figure, perfect 
overlap in the pipeline, where all four units operate in parallel, occurs at cycle 
(m). Levels about to be executed are at m+1, and those just executed are at 
m-1. The ’C4x pipeline controller supports a high-speed processing rate of
one execution per cycle. It also manages pipeline conflicts so that they are 
transparent to the user. You do not need to take any special precautions to 
guarantee correct operation. 
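
The overlap described above can be reproduced with a short Python sketch that fills a four-stage pipeline with one instruction per cycle and no conflicts. The function name `pipeline_table` is hypothetical, and the model ignores wait states and pipeline conflicts entirely.

```python
STAGES = ("Fetch", "Decode", "Read", "Execute")


def pipeline_table(instructions):
    """Return one row per cycle listing the instruction in each stage.

    Each row is [fetch, decode, read, execute]; "-" marks an empty stage.
    """
    count = len(instructions)
    rows = []
    for cycle in range(count + len(STAGES) - 1):
        row = []
        for stage in range(len(STAGES)):
            index = cycle - stage   # the instruction that entered this many cycles ago
            row.append(instructions[index] if 0 <= index < count else "-")
        rows.append(row)
    return rows
```

Calling `pipeline_table("WXYZ")` yields seven rows; the fourth row has Z in fetch, Y in decode, X in read, and W in execute, which is the perfect-overlap cycle (m).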


Figure 8-1. Pipeline Structure

CYCLE   Fetch   Decode   Read   Execute
m-3     W       -        -      -
m-2     X       W        -      -
m-1     Y       X        W      -
m       Z       Y        X      W         ← Perfect overlap
m+1     -       Z        Y      X
m+2     -       -        Z      Y
m+3     -       -        -      Z


Notes: 1) W, X, Y, and Z represent instructions.
       2) F, D, R, E = fetch, decode, read, and execute, respectively.


Priorities from highest to lowest have been assigned to each of the functional 
units of the pipeline and to the DMA controller as follows: 


❏ DMA (if configured as highest priority)
❏ Execute
❏ Read
❏ Decode
❏ Fetch
❏ DMA (if configured as lowest priority)


When the processing of an instruction is ready to pass to the next higher pipe- 
line level and that level is not ready to accept a new input, a pipeline conflict 
occurs. In this case, the lower priority unit waits until the higher priority unit 
completes its currently executing function. 
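
The priority ordering can be sketched as a simple arbiter in Python. This is an illustrative model only: the function name `grant` and the unit labels are hypothetical, and real conflicts depend on which resource is contended, which is not modeled here.

```python
CPU_ORDER = ["Execute", "Read", "Decode", "Fetch"]   # highest to lowest


def grant(requesting, dma_highest):
    """Return the unit that proceeds when several compete in the same cycle.

    requesting:  set of unit names drawn from CPU_ORDER plus "DMA".
    dma_highest: True if the DMA is configured as the highest priority,
                 False if it is configured as the lowest.
    """
    order = ["DMA"] + CPU_ORDER if dma_highest else CPU_ORDER + ["DMA"]
    for unit in order:
        if unit in requesting:
            return unit      # every other requester waits
    return None
```

For example, `grant({"Fetch", "Read"}, False)` returns "Read", so the fetch unit waits until the read completes.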

8.2 Pipeline Conflicts


Pipeline conflicts in the ’C4x can be grouped into the following three main 
categories: 


Branch Conflicts    Involve most of those instructions or operations that 
                    read and/or modify the PC. 


Register Conflicts  Involve delays that can occur when reading from or 
                    writing to registers that are used for address generation, 
                    such as AR0-AR7, IR0, IR1, BK, DP, and SP. 


Memory Conflicts    Occur when the internal units of the ’C4x compete for 
                    memory resources. 


Each of these three types is discussed in the following subsections. Examples 
are included. Note that in these examples, when data is refetched or an 
operation is repeated, the symbol representing the stage of the pipeline is 
appended with a number. For example, if a fetch is performed again, the 
instruction mnemonic is repeated. The symbol RDY with an overbar indicates 
that a unit is not ready, and the plain symbol RDY indicates that a unit is 
ready. 


8.2.1. Branch Conflicts 


Branch conflicts involve most of the instructions or operations that read and/or 
modify the PC. 


8.2.1.1 Standard Branches 


Pipeline conflicts occur with standard (nondelayed) branches, i.e., BR, Bcond, 
DBcond, CALL, IDLE, RPTB, RPTS, RETIcond, RETScond, interrupts, and 
reset, because their execution is all the pipeline can handle. Other information 
fetched into the pipeline is discarded or refetched, or the pipeline becomes 
inactive; this is referred to as flushing the pipeline. Flushing the pipeline is 
necessary in these cases to prevent partial execution of succeeding instructions. 
The branches discussed here are loads; TRAPcond and CALLcond are 
treated as conditional stores and are shown in Example 8-13. 


Example 8-1 shows the code and pipeline operation for a standard branch. 
Note that one dummy fetch is performed (MPYF instruction), and then after the 
branch address is available, a new fetch (OR instruction) is performed. This 
dummy fetch introduces the MPYF instruction into the cache. 


Example 8-1. Standard Branch 


        BR    THREE   ; Unconditional branch 
        MPYF          ; Not executed 
        ADD           ; Not executed 
        SUBF          ; Not executed 
        AND           ; Not executed 
THREE   OR            ; Fetched after BR is taken 
        STI 

PIPELINE OPERATION 

PC      Fetch   Decode   Read    Execute
n       BR      -        -       -
n+1     MPYF    BR       -       -
n+1     (nop)   (nop)    BR      -         Fetch held for new PC value
n+1     (nop)   (nop)    (nop)   BR        THREE -> PC
THREE   OR      (nop)    (nop)   (nop)
        STI     OR       (nop)   (nop)


Note: 


Both RPTS and RPTB flush the pipeline, allowing the RS, RE, and RC regis- 
ters to be loaded at the proper time. If these registers are loaded without the 
use of RPTS or RPTB, no flushing of the pipeline occurs. Thus, RS, RE, and 
RC can be used as general-purpose 32-bit registers without pipeline con- 
flicts. When RPTB is nested because of nested interrupts, it may be neces- 
sary to load and store these registers directly while using the repeat modes. 
Since up to four instructions can be fetched before the repeat mode is en- 
tered, loads should be followed by a branch to flush the pipeline. If the RC 
is changing when an instruction is loading it, the direct load takes priority over 
the modification made by the repeat mode logic. 


8.2.1.2 Delayed Branches Without Annul Option 


Delayed branches are implemented to ensure that the next three instructions 
are fetched and executed. The delayed branches without annul option include 
BRD, BcondD, and DBcondD. Example 8-2 shows the code and pipeline 
operation for a delayed branch. 


Example 8-2. Delayed Branch Without Annul Option 


        BRD   THREE   ; Unconditional delayed branch 
        MPYF          ; Executed 
        ADDF          ; Executed 
        SUBF          ; Executed 
        AND           ; Not executed 
THREE   MPYF          ; Fetched after SUBF is fetched 

PIPELINE OPERATION 

PC      Fetch   Decode   Read    Execute
n       BRD     -        -       -
n+1     MPYF    BRD      -       -         No execute delay
n+2     ADDF    MPYF     BRD     -
n+3     SUBF    ADDF     MPYF    BRD       THREE -> PC
THREE   MPYF    SUBF     ADDF    MPYF


8.2.1.3 Delayed Branches With Annul Option 


The ’C4x supports delayed branches with an annulling option: BcondAT 
(branch conditional, annul if true) and BcondAF (branch conditional, annul if 
false). The true or false status of the condition controls whether or not a branch 
is performed (as in a delayed branch). The annulling operation cancels the 
effect of the execute phase of the first instruction and of the read and execute 
phases of the second and third instructions following the BcondAT or 
BcondAF. 


- If the condition is true, BcondAT performs a branch, and the annulling 
  operation takes place. Otherwise, the branch is not taken and the 
  annulling operation does not take place. 


- If the condition is false, BcondAF does not perform a branch, and the 
  annulling operation takes place. Otherwise, the branch is taken and the 
  annulling operation does not take place. 


See subsection 7.2.2 for more information about delayed branches with annul- 
ling. Example 8-3 uses both BcondAT and BcondAF. 
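

The two rules above reduce to a small truth table. The Python sketch below
encodes them as stated in this subsection; the function name annul_branch
and the "AT"/"AF" strings are hypothetical conveniences, not ’C4x syntax.

```python
def annul_branch(variant, condition_true):
    """Model BcondAT ("AT") and BcondAF ("AF").

    Returns (branch_taken, annulled), where annulled refers to the
    three instructions that follow the branch.
    """
    if variant not in ("AT", "AF"):
        raise ValueError("variant must be 'AT' or 'AF'")
    # The branch itself follows the condition for both variants.
    branch_taken = condition_true
    # AT annuls when the condition is true, AF when it is false.
    annulled = condition_true if variant == "AT" else not condition_true
    return branch_taken, annulled
```

In both variants the branch follows the condition; only the sense of the
annul differs. BcondAT annuls on the taken path, and BcondAF annuls on the
fall-through path.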


Example 8-3. Using BcondAF and BcondAT Instructions 


        LDI    *AR1,R0 
        BNAT   bottom        ; If negative, branch and 
        ADDI   *++AR2,R3     ; annul the execute phase 
        MPYF                 ; of ADDI, MPYF, and NOT. 
        NOT                  ; Otherwise, don't annul and 
top:    SUBF                 ; continue with SUBF. 
        SUBI   1,R0 
        BNNAF  top           ; If not negative, branch and 
        ADDI   *++AR2,R3     ; do not annul the execute 
        MPYF                 ; phase of ADDI, MPYF, and 
        NOT                  ; NOT. Otherwise, annul ADDI, 
bottom: XOR                  ; MPYF, and NOT, and continue 
                             ; with XOR. 


At the start of Example 8-3, if the result of the load is negative (a true 
condition), the BcondAT instruction causes a branch and also annuls the 
execute phase of the three instructions that follow it. As a result, the execute 
phase of the ADDI instruction does not occur, and register R3 is not updated 
by addition. However, AR2 is incremented, and data at the corresponding 
address is read because these operations are in the decode and read phases 
of the pipeline, respectively, and thus cannot be annulled. 


Two types of operations can be annulled: 


- All writes to the register file that occur in the execute phase (ADDs, LDs, 
  etc., but not LDA, LDPK, etc.) 


- All stores to memory 


8.2.2 Register Conflicts 


A register conflict occurs if you read from or write to a register used for 
addressing purposes (AR0-AR7, IR0, IR1, BK, DP, and SP) when the register 
is not ready to be used. For example, if an instruction writes to one of these 
registers, the decode unit cannot use that same register until the write is 
complete (which occurs in the execute stage). 


In Example 8—4, an auxiliary register is loaded, and the same auxiliary register 
is used on the next instruction. Since the decode stage needs the result of the 
write to the auxiliary register, the decode of this second instruction is delayed 
two cycles. Every time the decode is delayed, a refetch of the program word 
is performed; i.e., ADDF is fetched three times. Because these are actual re- 
fetches, they can cause not only conflicts with the DMA controller, but also 
cache hits and misses. If the AR register used in the MPYF instruction were 
different from the one used in the LDI instruction, no delay would occur. 
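

The delay rule can be summarized in a few lines of Python. This is only a
sketch of the behavior described in this subsection (including the read case
covered in the paragraphs that follow Example 8-4); the function name
decode_delay and the register sets are illustrative assumptions, and the
model looks at just two adjacent instructions.

```python
AUX = {f"AR{i}" for i in range(8)}
# Registers whose write delays a following address-generation use.
DELAY_ON_WRITE = AUX | {"IR0", "IR1", "BK", "DP", "SP"}
# Registers whose read delays a following address-generation use.
DELAY_ON_READ = AUX | {"SP"}


def decode_delay(prev_writes, prev_reads, next_address_regs):
    """Return the decode delay (in cycles) of the second instruction."""
    used = set(next_address_regs)
    if set(prev_writes) & used & DELAY_ON_WRITE:
        return 2  # wait for the write in the execute stage
    if set(prev_reads) & used & DELAY_ON_READ:
        return 1  # wait for the read at the start of execute
    return 0
```

For the sequence LDI 7,AR2 followed by MPYF *AR2,R0, the call
decode_delay({"AR2"}, set(), {"AR2"}) gives a two-cycle delay.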


Example 8-4. Write to an AR Followed by an AR for Address Generation 


        LDI    7,AR2       ; 7 -> AR2 
        MPYF   *AR2,R0     ; Decode delayed 2 cycles 
        ADDF 
        FLOAT 

PIPELINE OPERATION 

PC      Fetch   Decode   Read    Execute
n       LDI     -        -       -
n+1     MPYF    LDI      -       -
n+2     ADDF    MPYF     LDI     -           Decode/address generation held
n+2     ADDF    MPYF     (nop)   LDI 7,AR2   for a new AR value; AR2 loaded
n+2     ADDF    MPYF     (nop)   (nop)
n+3     FLOAT   ADDF     MPYF    (nop)


Conflicts involving reads are similar to those involving writes. If an instruction 
must read registers AR0-AR7 or SP, the use of those particular registers by 
the decode stage for the following instruction is delayed until the read is 
complete. The registers are read at the start of the execute cycle and therefore 
require only a one-cycle delay of the following decode. For the other four 
registers (IR0, IR1, BK, and DP), no delay is incurred upon a read. 


In Example 8-5, two auxiliary registers are added together with the result 
going to an extended-precision register. The next instruction uses one of the 
same auxiliary registers as an address register. If the MPYF instruction used 
an AR register other than AR0 or AR2, no delay would occur. 


Example 8-5. A Read of ARs Followed by ARs for Address Generation 


        ADDI   AR0,AR2,R1    ; AR0 + AR2 -> R1 
        MPYF   *++AR2,R0     ; Decode delayed 1 cycle 
        ADDF 
        FLOAT 

PIPELINE OPERATION 

PC      Fetch   Decode   Read    Execute
n       ADDI    -        -       -
n+1     MPYF    ADDI     -       -
n+2     ADDF    MPYF     ADDI    -         Decode/address generation held
n+2     ADDF    MPYF     (nop)   ADDI      until AR is read; ARs read
n+3     FLOAT   ADDF     MPYF    (nop)

Note: 


The DBR (decrement and branch) instruction’s use of auxiliary registers for 
loop counters is treated the same as if the use were for addressing. 
Therefore, the operation shown in the two previous examples can also occur for 
this instruction. 


8.2.3 Memory Conflicts 


Memory conflicts can occur when the memory bandwidth of a physical 
memory space is exceeded. RAM blocks 0 and 1 and the ROM block can 
support only two accesses per cycle. The external interface can support only one 
access per cycle. Some conditions under which memory conflicts can be 
avoided are discussed in Section 8.3, on page 8-17. 


Memory pipeline conflicts consist of the following four types: 


Program Wait              A program fetch is prevented from beginning. 

Program Fetch Incomplete  A program fetch has begun but is not yet 
                          complete. 

Execute Only              An instruction sequence requires three CPU 
                          data accesses in a single cycle. 

Hold Everything           A global or local bus operation must complete 
                          before another one can proceed. 


These four types of memory conflicts are illustrated in examples and 
discussed in the paragraphs that follow. 


8.2.3.1 Program Wait 


Two conditions can delay an instruction fetch: 


- Too many accesses are made to the same memory at the start of a CPU 
  data access. This can occur in two cases: 

    - Two CPU data accesses are made to an internal RAM or ROM block, 
      and a program fetch from the same block is necessary. 

    - One of the external ports is starting a CPU data access, and a program 
      fetch from the same port is necessary. 


- A multicycle CPU data access or DMA data access over the external bus 
  is needed. 



Example 8-6 illustrates a program wait until a CPU data access completes. 
In this case, *AR0 and *AR1 are both pointing to data in RAM block 0, and the 
MPYF instruction will be fetched from RAM block 0. This results in the conflict 
shown. Since no more than two accesses can be made to RAM block 0 in a 
single cycle, the program fetch cannot begin and must wait until the CPU data 
accesses are complete. 
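

The bandwidth rule behind this program wait can be expressed as a small
check. The Python sketch below assumes the per-cycle limits given at the
start of this section (two accesses for each internal block, one for each
external port); the labels RAM0, RAM1, ROM, GLOBAL, and LOCAL and the
function name program_wait are hypothetical, and DMA and multicycle
accesses are not modeled.

```python
# Accesses each memory space can serve in one cycle (from the text).
CAPACITY = {"RAM0": 2, "RAM1": 2, "ROM": 2, "GLOBAL": 1, "LOCAL": 1}


def program_wait(data_accesses, fetch_from):
    """Return True if the program fetch must wait this cycle.

    data_accesses: memory spaces touched by CPU data accesses.
    fetch_from:    memory space holding the instruction to fetch.
    """
    used = sum(1 for space in data_accesses if space == fetch_from)
    # CPU data accesses go first; the fetch needs one free slot.
    return used >= CAPACITY[fetch_from]
```

For the situation in Example 8-6, program_wait(["RAM0", "RAM0"], "RAM0")
reports a wait, whereas moving either operand to RAM block 1 removes it.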


Example 8-6. Program Wait Until CPU Data Access Completes 


        ADDF3  *AR0,*AR1,R0 
        FIX 
        MPYF 
        ADDF3 
        NEGB 

PIPELINE OPERATION 

PC      Fetch    Decode   Read    Execute
n       ADDF3    -        -       -
n+1     FIX      ADDF3    -       -
n+2     (wait)   FIX      ADDF3   -         ARs read; fetch held until ARs are read
n+2     MPYF     (nop)    FIX     ADDF3
n+3     ADDF3    MPYF     (nop)   FIX
n+4     NEGB     ADDF3    MPYF    (nop)


Example 8-7 shows a program wait due to a multicycle data-data access or 
a multicycle DMA access. The ADDF, MPYF, and SUBF are fetched from some 
portion in memory other than the external port the DMA requires. The DMA 
begins a multicycle access. The program fetch corresponding to the CALL is 
made to the same external port that the DMA is using. 


Even if the DMA is configured as the lowest priority, a multicycle access 
cannot be aborted. The program fetch must therefore wait until the DMA access 
completes. 


Example 8-7. Program Wait Due to Multicycle Access 


PIPELINE OPERATION 

PC      Fetch    Decode   Read    Execute
n       ADDF     -        -       -
n+1     MPYF     ADDF     -       -
n+2     SUBF     MPYF     ADDF    -
n+3     (wait)   SUBF     MPYF    ADDF      2-cycle DMA access
n+3     CALL     (nop)    SUBF    MPYF
n+4              CALL     (nop)   SUBF


8.2.3.2 Program Fetch Incomplete 


A program fetch incomplete occurs when an instruction fetch takes more than 
one cycle to complete because of wait states. In Example 8-8, the MPYF and 
ADDF are fetched from memory that supports single-cycle accesses. The 
SUBF is fetched from memory requiring one wait state. One example that 
demonstrates this conflict is a fetch across a bank boundary on the external 
port. 


Example 8-8. Multicycle Program Memory Fetches 


PIPELINE OPERATION 

PC              Fetch   Decode   Read    Execute
n               MPYF    -        -       -
n+1             ADDF    MPYF     -       -
n+2 (not RDY)   SUBF    ADDF     MPYF    -         1 wait state required
n+2 (RDY)       SUBF    (nop)    ADDF    MPYF
n+3             ADDI    SUBF     (nop)   ADDF


8.2.3.3 Execute Only 


The Execute Only type of memory pipeline conflict occurs when a sequence 
of instructions requires three CPU data accesses in a single cycle. There are 
two cases in which this occurs: 


- An instruction performs a store and is followed by an instruction that 
  performs two memory reads. 


- An instruction performs two stores and is followed by an instruction that 
  performs at least one memory read. 


The first case is shown in Example 8-9. Since this sequence requires three 
data memory accesses and only two are available, only the execute phase of 
the pipeline is allowed to proceed. The dual reads required by the LDF || LDF 
are delayed one cycle. Note that in this case a refetch of the next instruction 
can occur, which could cause an additional access to memory. 
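

Both cases follow from the limit of two CPU data accesses per cycle. The
following Python sketch states that rule directly; the function name
execute_only_stall is hypothetical, and the model considers only the stores
of the instruction in the execute phase and the reads of the instruction
immediately behind it.

```python
MAX_DATA_ACCESSES = 2  # CPU data accesses available per cycle (from the text)


def execute_only_stall(stores_in_execute, reads_in_read):
    """Return True if only the execute phase may proceed this cycle.

    stores_in_execute: stores issued by the instruction in execute.
    reads_in_read:     memory reads needed by the following instruction.
    """
    return stores_in_execute + reads_in_read > MAX_DATA_ACCESSES
```

A single store followed by LDF || LDF gives execute_only_stall(1, 2), which
is True, so the two reads are delayed by one cycle.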


Example 8-9. Single Store Followed by Two Reads 


        STF   R0,*AR1      ; R0 -> *AR1 
        LDF   *AR2,R1      ; *AR2 -> R1 in parallel with 
||      LDF   *AR3,R2      ; *AR3 -> R2 

PIPELINE OPERATION 

PC      Fetch      Decode     Read       Execute
n       STF        -          -          -
n+1     LDF||LDF   STF        -          -
n+2     W          LDF||LDF   STF        -
n+3     X          W          LDF||LDF   STF        Write must complete before
n+4     X          W          LDF||LDF   (nop)      the 2 reads can complete.
n+4     Y          X          W          LDF||LDF


Example 8-10 shows a parallel store followed by a single load or read. Since 
two parallel stores are required, the next CPU data memory read must wait one 
cycle before beginning. One program memory refetch may occur. 


Example 8-10. Parallel Store Followed by Single Read 


        STF   R0,*AR0      ; R0 -> *AR0 in parallel with 
||      STF   R2,*AR1      ; R2 -> *AR1 
        ADDF  @SUM,R1      ; R1 + @SUM -> R1 
        IACK 
        ASH 

PIPELINE OPERATION 

PC      Fetch      Decode     Read       Execute
n       STF||STF   -          -          -
n+1     ADDF       STF||STF   -          -
n+2     IACK       ADDF       STF||STF   -
n+3     ASH        IACK       ADDF       STF||STF   Read must wait until the
n+4     ASH        IACK       ADDF       (nop)      writes are complete
n+4     -          ASH        IACK       ADDF




8.2.3.4 Hold Everything 


Three types of conditions cause Hold Everything memory pipeline conflicts: 


- A CPU data load or store cannot be performed because an external port 
  is busy. 


- An external load takes more than one cycle. 


- A conditional call or trap is executed; these take one more cycle 
  than conditional branches. 


The first type of Hold Everything conflict occurs when one of the external ports 
is busy because of an access that has started but is not complete. In 
Example 8-11, the first store is a two-cycle store. The CPU writes the data to 
an external port. The port control then takes two cycles to complete the data- 
data write. The LDF is a read over the same external port. Since the store is 
not complete, the CPU continues to attempt processing the LDF until the port 
is available. 


Example 8-11. Busy External Port 


        STF   R0,@DMA1 
        LDF   @DMA2,R0 

PIPELINE OPERATION 

PC      Fetch   Decode   Read    Execute
n       STF     -        -       -
n+1     LDF     STF      -       -
n+2     W       LDF      STF     -
n+2     W       LDF      (nop)   STF       2-cycle external bus
n+2     W       LDF      (nop)   (nop)     write access
n+3     X       W        LDF     (nop)
n+4     Y       X        W       LDF


The second type of Hold Everything conflict involves multicycle data reads. In 
this case, the read has begun and continues until completed. In 
Example 8-12, the LDF is performed from an external memory that requires 
several cycles to access. 


Example 8-12. Multicycle Data Reads 


        LDF   @DMA,R0 

PIPELINE OPERATION 

PC      Fetch       Decode   Read    Execute
n       LDF         -        -       -
n+1     I           LDF      -       -
n+2     J           I        LDF     -
n+3     K (dummy)   I        LDF     -         2-cycle external bus read access
n+3     K           J        I       LDF


The final type of Hold Everything conflict deals with conditional calls 
(CALLcond) and traps (TRAPcond), which are different from other branch 
instructions. Whereas other branch instructions are conditional loads, the 
conditional calls and traps are conditional stores, which take one more cycle to 
complete than conditional branches (see Example 8-13). The added cycle pushes 
the return address after the call condition is evaluated. 


Example 8-13. Conditional Calls and Traps 


PIPELINE OPERATION 

PC             Fetch      Decode     Read       Execute
n              CALLcond   -          -          -
n+1            I          CALLcond   -          -
n+1            (nop)      (nop)      CALLcond   -
n+1            (nop)      (nop)      (nop)      CALLcond
n+1            (nop)      (nop)      (nop)      CALLcond   PC store cycle
n+2/CALLaddr   I          (nop)      (nop)      (nop)


8.3 Memory Accesses for Maximum Performance 


If program fetches and data accesses are performed in such a manner that the 
resources being used cannot provide the necessary bandwidth, the pipeline 
is stalled until the accesses are complete. Certain configurations of program 
fetch and data accesses yield conditions under which the ’C4x can achieve 
maximum throughput. 


Table 8-1 shows how many accesses can be performed from the different 
memory spaces when it is necessary to do a program fetch and a single data 
access, and still achieve maximum performance (one cycle). Four cases 
achieve one-cycle maximization. 
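

The four cases of Table 8-1 can be kept in a lookup set when checking a
planned memory layout. The Python sketch below simply encodes the table; the
name is_table_8_1_case and the tuple layout (global bus, dual-access
internal memory, local or peripheral bus) are illustrative assumptions.

```python
# (global bus, dual-access internal memory, local or peripheral bus)
TABLE_8_1_CASES = {
    (1, 1, 0),  # case 1
    (1, 0, 1),  # case 2
    (0, 2, 0),  # case 3: any combination of internal memory
    (0, 1, 1),  # case 4
}


def is_table_8_1_case(global_bus, internal, local_or_peripheral):
    """Return True if the access split matches a case in Table 8-1."""
    return (global_bus, internal, local_or_peripheral) in TABLE_8_1_CASES
```

For instance, placing both the program fetch and the data access on the
global bus, is_table_8_1_case(2, 0, 0), is not one of the listed cases.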


Table 8-1. One Program Fetch and One Data Access for Maximum Performance 


Case   Global Bus   Accesses From Dual-Access   Local Bus or
No.    Accesses     Internal Memory             Peripheral Accesses
1      1            1                           -
2      1            -                           1
3      -            2 from any combination      -
                    of internal memory
4      -            1                           1


Table 8-2 shows how many accesses can be performed from the different 
memory spaces when it is necessary to do a program fetch and two data ac- 
cesses, still achieving maximum performance (one cycle). Six cases achieve 
this maximization. 


Table 8-2. One Program Fetch and Two Data Accesses for Maximum Performance 


8 
9 


Global Bus 
Accesses 


{ 


1 program 
1 data 
1 data 


1 program 
1 DMA 


Accesses From Dual-Access 


Internal Memory 


2 from any combination of internal 


memory 
1 data 
1 data 
1 program, 1 data 


2 from same internal memory block 
and1 from a different internal memory 


block 


3 from different internal memory blocks 


2 from any combination of internal 


memory 
2 data 
2 data 


Local Or 
Peripheral 
Bus 
Accesses 


1 data 
1 program 
1 DMA 


1 DMA 


1 DMA 
1 program 


† For Cases 2 and 3, see Three-Operand Instruction Memory Reads on page 8-20. 


8.4 Clocking of Memory Accesses 


This section discusses the role of internal clock phases (H1 and H3) in the way 
the ’C4x handles multiple memory accesses. Whereas the previous section 
discussed the interaction between sequences of instructions, this section 
discusses the flow of data on an individual instruction basis. 


Each major clock period of 40 ns is composed of two minor clock periods of 
20 ns, labeled H3 and H1 (these times assume a 50-MHz ’C40). The active 
clock period for H3 and H1 is the time when that signal is high. 
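

The timing figures above follow from the clock rate. The Python sketch below
shows the arithmetic for the 50-MHz case quoted in the text; the function
name clock_periods_ns is hypothetical, and extending the same relation to
other clock rates is an assumption rather than something stated here.

```python
def clock_periods_ns(clock_mhz):
    """Return (major_period_ns, minor_period_ns) for a given clock rate.

    Assumption: one minor period equals one period of the quoted clock
    rate, and a major period spans two minor periods (H3 then H1).
    """
    minor_ns = 1000.0 / clock_mhz
    major_ns = 2.0 * minor_ns
    return major_ns, minor_ns
```

For a 50-MHz ’C40, clock_periods_ns(50) gives a 40-ns major period made up
of two 20-ns minor periods, matching the values above.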


[Timing diagram: the H1 and H3 clock signals over one major clock period] 


The precise operation of memory reads and writes can be defined according 
to these minor clock periods. The types of memory operations that can occur 
are program fetches, data loads and stores, and DMA accesses. Internal DMA 
data accesses always start during the H3 cycle. 


8.4.1. Program Fetches 


Internal program fetches are always performed during H3 unless a single data 
store must occur at the same time because of another instruction in the 
pipeline. In that case, the program fetch occurs during H1 and the data store 
occurs during H3. 


External program fetches always start at the beginning of H3 with the address 
being presented on the external bus. At the end of H1, the fetches are 
completed with the latching of the instruction word. 


8.4.2 Data Loads and Stores 


Four types of instructions perform loads, memory reads, and stores: two-oper- 
and instructions, three-operand instructions, multiplier/ALU operation with 
store instructions, and parallel multiply and add instructions. See Chapter 6 for 
detailed information on addressing modes. 


As discussed in Chapter 9, the number of bus cycles for external memory 
accesses differs in some cases from the number of CPU execution cycles. For 
external reads, the number of bus cycles and CPU execution cycles is identi- 
cal. For external writes, there are always at least two bus cycles, but unless 
there is a port access conflict, there is only one CPU execution cycle. In the 
following examples, any difference in the number of bus cycles and CPU 
cycles is noted. 


8.4.2.1. Two-Operand Instruction Memory Accesses 


Figure 8-2. Two-Operand Instruction Word 


[Figure: 32-bit instruction word, bits 31-0] 


Two-operand instructions include all those instructions with bits 31-29 being 
000 or 010 in binary (see Figure 8-2). In the case of a data read, bits 15-0 
represent the src operand. Internal data reads are always performed during H1. 
External data reads always start at the beginning of H3 with the address 
presented on the external bus, and they complete with the latching of the data 
word at the end of H1. 


In the case of a data store, bits 15-0 represent the dst operand. Internal data 
stores are performed during H3. External data stores always start at the 
beginning of H3 with the address and data presented on the external bus. 


8.4.2.2 Three-Operand Instruction Memory Reads 


Figure 8-3. Three-Operand Instruction Word 


[Figure: 32-bit instruction word, bits 31-0] 


Three-operand instructions include all instructions with bits 31-29 being 001 
in binary (see Figure 8-3). The source operands, src1 and src2, come from 
either registers or memory. When one or more of the source operands are from 
memory, these instructions are always memory reads. 


If only one of the source operands is from memory (either src1 or src2) and is 
located in internal memory, the data is read during H1. If the single memory 
source operand is in external memory, the read starts at the beginning of H3, 
with the address presented on the external bus, and completes with the 
latching of the data word at the end of H1. 


If both source operands are to be fetched from memory, then memory reads 
can occur in several ways: 


- If both operands are located in internal memory, the src1 read is 
  performed during H3 and the src2 read during H1, thus completing two 
  memory reads in a single cycle. 


- If src1 is in internal memory and src2 is in external memory, the src2 
  access begins at the start of H3 and latches at the end of H1. At the same 
  time, the src1 access to internal memory is performed during H3. Again, 
  two memory reads are completed in a single cycle. 


- If src1 is in external memory and src2 is in internal memory, two cycles are 
  necessary to complete the two reads. In the first cycle, the internal src2 
  access is performed. The src1 access is also performed, but not latched 
  until the next H3. 


- If src1 and src2 are both from external memory, two cycles are required 
  to complete the two reads. In the first cycle, the src1 access is performed 
  and loaded on the next H3; in the second cycle, the src2 access is 
  performed and loaded on that cycle’s H1. 


8.4.2.3. Operations with Parallel Stores 


Figure 8-4. Multiply or CPU Operation With a Parallel Store 


[Figure: 32-bit instruction word, bits 31-0] 


The next class of instructions includes all instructions that have a store in 
parallel with another instruction. Bits 31 and 30 for these instructions are 
equal to 11 in binary. 


For operations that perform a multiply or ALU operation in parallel with a store, 
the instruction word format is shown in Figure 8-4. Whether the store to dst2 
is external or internal, it is performed during H3. Two bus cycles are required 
for external stores, but only one CPU cycle is necessary to complete the write. 


If the memory read operation is external, it starts at the beginning of H3 and 
latches at the end of H1. If the memory read operation is internal, it is 
performed during H1. Note that memory reads are performed by the CPU 
during the read (R) phase of the pipeline, and stores are performed during the 
execute (E) phase. 


The instruction word format for instructions that have parallel stores to memory 
is shown in Figure 8-5. If both destination operands, dst1 and dst2, are 
located in internal memory, dst1 is stored during H3 and dst2 during H1, thus 
completing two memory stores in a single cycle. 


Figure 8-5. Two Parallel Stores 


[Figure: 32-bit instruction word, bits 31-0] 


If dst1 is in external memory and dst2 is in internal memory, the dst1 store 
begins at the start of H3. The dst2 store to internal memory is performed during 
H1. Two bus cycles are required for the external store, but only one CPU cycle 
is necessary to complete the write. Again, two memory stores are completed 
in a single cycle. 


If dst1 is in internal memory and dst2 is in external memory, an additional bus 
cycle is necessary to complete the dst2 store. Only one CPU cycle is 
necessary to complete the write, but the port access requires three bus cycles. 
In the first cycle, the internal dst1 store is performed during H3, and dst2 is 
written to the port during H1. During the next cycle, the dst2 store is performed 
on the external bus, beginning in H3, and executes as normal through the 
following cycle. 


If dst1 and dst2 are both written to external memory, a single CPU cycle is still 
all that is necessary to complete the stores. In this case, four bus cycles are 
required. 


1) In the first cycle, both dst1 and dst2 are written to the port, and the external 
bus access for dst1 begins. 


2) The store for dst1 is completed on the second cycle. 
3) The store for dst2 begins on the third external bus cycle. 


4) Finally, the store for dst2 is completed on the fourth external bus cycle. 
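

The four placements of the two destinations can be summarized as a lookup.
The Python sketch below records the CPU-cycle and bus-cycle counts given in
this subsection; the function name parallel_store_cost is hypothetical, and
the zero bus cycles for two internal stores is an assumption (internal
stores do not use the external bus).

```python
# (dst1 external?, dst2 external?) -> external bus cycles
BUS_CYCLES = {
    (False, False): 0,  # both internal (assumed: no external bus use)
    (True, False): 2,   # external dst1, internal dst2
    (False, True): 3,   # internal dst1, external dst2
    (True, True): 4,    # both external
}


def parallel_store_cost(dst1_external, dst2_external):
    """Return (cpu_cycles, bus_cycles) for two parallel stores."""
    cpu_cycles = 1  # one CPU cycle in every case described in the text
    return cpu_cycles, BUS_CYCLES[(dst1_external, dst2_external)]
```

Placing the external destination first, parallel_store_cost(True, False),
needs two bus cycles, while the reverse order needs three.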


8.4.2.4 Parallel Multiplies and Adds 


Memory addressing for parallel multiplies and adds is similar to that for 
three-operand instructions. The parallel multiplies and adds include all 
instructions with bits 31-30 equal to 10 in binary (see Figure 8-6). 


Figure 8-6. Parallel Multiplies and Adds 


[Figure: 32-bit instruction word, bits 31-0. Fields from bit 31 down: 10, 
operation, P, d1, d2, src1, src2, src3, src4] 


For these operations, src3 and src4 are both located in memory. If both 
operands are located in internal memory, the src3 read is performed during H3, 
and the src4 read is performed during H1, thus completing two memory reads in 
a single cycle. 


If src3 is in internal memory and src4 is in external memory, the src4 access 
begins at the start of H3 and latches at the end of H1. At the same time, the 
src3 access to internal memory is performed during H3. Again, two memory 
reads are completed in a single cycle. 


If src3 is in external memory and src4 is in internal memory, two cycles are 
necessary to complete the two reads. In the first cycle, the internal src4 access 
is performed. During the H3 of the next cycle, the src3 access is performed. 


If src3 and src4 are both from external memory, two cycles are necessary to 
complete the two reads. In the first cycle, the src3 access is performed; in the 
second cycle, the src4 access is performed. 
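

The four operand placements described above map to a cycle count as
follows. This Python sketch is only a summary of the text; the function name
parallel_read_cycles is hypothetical, and wait states on the external bus
are not modeled.

```python
# (src3 external?, src4 external?) -> cycles to complete both reads
READ_CYCLES = {
    (False, False): 1,  # src3 in H3, src4 in H1
    (False, True): 1,   # internal src3 in H3 while external src4 runs
    (True, False): 2,   # src4 first, src3 in the next cycle's H3
    (True, True): 2,    # src3 in the first cycle, src4 in the second
}


def parallel_read_cycles(src3_external, src4_external):
    """Return the cycles needed to read src3 and src4."""
    return READ_CYCLES[(src3_external, src4_external)]
```

In this summary the count depends only on where src3 lives, so when one
operand must be external, making it src4 keeps the reads to a single cycle.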


External Bus Operation 


The ’C4x has two identical external bus interfaces. One bus is called the global 
memory interface and the other bus is called the local memory interface. 
These buses are designed to allow higher throughput by permitting simulta- 
neous loads and stores to different external memories. 


The information in this chapter applies to both the global memory interface and 
the local memory interface; however, in some sections, only the global 
memory interface is shown. Examples of memory interfacing are provided in 
the TMS320C4x General-Purpose Applications User’s Guide. 


The ’C4x has two identical parallel external interfaces: the global memory 
interface and the local memory interface. Each interface has the following 
features: 


- Separate configurations, each with its own 32-bit data bus and 31-bit 
  address bus (24-pin address bus in the ’C44) 

- Single-cycle reads and pipelined writes 

- Independent enable signals for data, address, and control lines 

- Bus-request and bus-lock signaling for shared memory parallel 
  processing 

- User-controlled mapping of addresses to either of two sets of independent 
  strobes for different speed memories 

- Look-ahead bus status signals for defining current and requested bus 
  operations for parallel processing arbitration 

- Selectable wait states (both software- and hardware-controlled) 

- Signals that indicate when memory-page boundaries are crossed 


Note: 


The global-memory interface is identical in every way to the local memory 
interface except that (1) they have different positions in the memory map, 
and (2) the control signals for the local memory interface are labeled with an 
additional “L” prefix (as described in Figure 9-1 on page 9-3). 


Throughout this chapter, no distinction is made between global and local 
interface signals or between STRB0 and STRB1, except for clarity. 


The signals that indicate when memory-page boundaries are crossed support 
three main types of memory: 


- Page-mode and static-column decode DRAMs 

- High-speed SRAM banks 

- Slow-speed memory banks and I/O devices 


9.2 Memory Interface Signals 


As shown in Figure 9-1, the global-memory interface has two sets of control 
signals, STRB0 and STRB1. The global-memory port control registers 
(Section 9.3 on page 9-6) define which set of registers is active. 


Figure 9-1. Global and Local Memory Interface Control Signals 


[Figure: block diagram of the global-memory interface signals. Signals shown 
include A(30-0), AE, STAT(3-0), and LOCK, together with the STRB0 control 
signal group and the STRB1 control signal group (R/W1, STRB1, PAGE1, RDY1, 
CE1).] 


Note: The signals used in this figure are for the global-memory interface. The 
local-memory interface signals have the same configuration, with an additional 
“L” prefix added to each signal (for example, STRB0 becomes LSTRB0). 


Table 9-1. Global Memory Interface Signals 


Each entry lists the signal†, its type§, its value after reset, and its idle 
status‖, followed by the description. 


AE¶          Type: I      Value after reset: N.A.#   Idle status: ignored 
    Address bus enable signal for global-memory interface. When high (set 
    to 1), places address lines A30-0 in the high-impedance state. 

CE(0,1)¶     Type: I      Value after reset: N.A.    Idle status: ignored 
    Control signal enable for R/Wx, STRBx, and PAGEx signals. When high 
    (set to 1), it places the corresponding R/Wx, STRBx, and PAGEx signals 
    in high-impedance state (x = 0 for CE0 and x = 1 for CE1). 

DE¶          Type: I      Value after reset: N.A.    Idle status: ignored 
    Data bus enable signal for global memory interface. When high (set 
    to 1), places data lines D31-0 in the high-impedance state. Reads can 
    still occur but writes cannot. 

LOCK‡        Type: O      Value after reset: 1       Idle status: 1 
    Lock signal for global bus interface. Indicates whether an interlocked 
    access is underway (0 = access underway; 1 = access not underway). 
    LOCK is changed only by the interlocked instructions. 

PAGE(0,1)    Type: O/Z    Value after reset: 0       Idle status: 0 
    Memory-page enable signal for STRB(0,1) accesses. 

RDY(0,1)     Type: I      Value after reset: N.A.    Idle status: ignored 
    Indicates external memory is ready to be accessed. 

R/W(0,1)     Type: O/Z    Value after reset: 1       Idle status: 1 
    Specifies memory read (active high) or write (active low) mode. 

STAT(3-0)‡   Type: O      Value after reset: all 1s  Idle status: all 1s 
    Four lines that define the status or function of the memory port as 
    shown in Table 9-2 (next page). 

STRB(0,1)    Type: O/Z    Value after reset: 1       Idle status: 1 
    Interface access strobe. 

A(30-0)      Type: O/Z    Idle status: address of last access 
    Address bus. The address lines are always driven. They keep the address 
    of the last access. 

D(31-0)      Type: I/O/Z  Value after reset: Hi-Z    Idle status: Hi-Z 
    Data bus. These signals go to high impedance between write accesses. 


† The numbers in parentheses mean that either a 0 (zero) or a 1 can follow the prefix shown to the left of the parenthesis. A zero 
indicates STRB0 control signals (shown in Figure 9-1), and a one indicates STRB1 control signals. 

‡ STAT(3-0) and LOCK cannot be controlled by an external control signal. 

§ O = output; I = input; Z = high-impedance state. 

¶ This signal can be used in a shared bus configuration to hold the ’C4x off the shared bus while another ’C4x accesses the shared 
memory and peripherals. 

# N.A. means not affected. 

‖ Idle status = no external memory access 


Table 9-2 shows how pins STAT3 to STAT0 define the current status of the
global-memory port. For bus accesses, these signals provide information
about the access that is about to begin. The code for a SIGI instruction read
is useful for distinguishing a SIGI read from an LDII or LDFI read.


The bus idle status code is 1111₂ (given at the bottom of Table 9-2). This
simplifies modular shared-bus multiprocessor interfaces because pull-up
resistors can be used to signal the idle condition when processor cards are not
attached to the shared bus.


Table 9-2. Global Memory Port Status for STRB0 and STRB1 Accesses

Value at Pins†
STAT3   STAT2   STAT1   STAT0   Status

0       0       0       0       STRB0 access, program read
0       0       0       1       STRB0 access, data read
0       0       1       0       STRB0 access, DMA read
0       0       1       1       STRB0 access, SIGI (instruction) read
0       1       0       0       Reserved
0       1       0       1       STRB0 access, data write
0       1       1       0       STRB0 access, DMA write
0       1       1       1       Reserved
1       0       0       0       STRB1 access, program read
1       0       0       1       STRB1 access, data read
1       0       1       0       STRB1 access, DMA read
1       0       1       1       STRB1 access, SIGI (instruction) read
1       1       0       0       Reserved
1       1       0       1       STRB1 access, data write
1       1       1       0       STRB1 access, DMA write
1       1       1       1       Idle

† This table applies to both the global-memory interface and local-memory interface (for local
  memory interface signals, add an L prefix to form LSTAT3, LSTAT2, etc.).
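
The encoding in Table 9-2 is regular: STAT3 selects the strobe, and the low
three bits select the access type. As an illustration only, the following
Python sketch decodes a 4-bit status value the way external monitoring logic or
a logic-analyzer script might. The function and dictionary names are
hypothetical and are not part of any TI tool.

```python
# Access types from Table 9-2, keyed by the low three bits (STAT2-STAT0).
# STAT3 selects the strobe: 0 = STRB0, 1 = STRB1.
ACCESS_TYPES = {
    0b000: "program read",
    0b001: "data read",
    0b010: "DMA read",
    0b011: "SIGI (instruction) read",
    0b101: "data write",
    0b110: "DMA write",
}


def decode_stat(stat: int) -> str:
    """Describe a 4-bit STAT(3-0) value (hypothetical helper)."""
    if not 0 <= stat <= 0b1111:
        raise ValueError("STAT(3-0) is a 4-bit value")
    if stat == 0b1111:
        return "Idle"
    access = ACCESS_TYPES.get(stat & 0b111)
    if access is None:
        return "Reserved"  # codes x100, plus 0111
    strobe = (stat >> 3) & 1
    return f"STRB{strobe} access, {access}"


if __name__ == "__main__":
    for code in range(16):
        print(f"{code:04b}  {decode_stat(code)}")
```

For the local-memory interface the same decoding applies to LSTAT3-LSTAT0,
with LSTRB0 and LSTRB1 in place of STRB0 and STRB1.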
9.3 Memory-Interface Control Registers

Figure 9-2 shows the memory map for both the global- and local-memory
interface-control registers. Figure 9-3 shows the fields in each register. Each
register can be programmed to control its respective memory interface by
defining the:

- Page size used for the two strobes at each port
- Address ranges over which the strobes are active
- Wait states
- Other operations that control the memory interface


Figure 9-3 lists the fields in these registers. 


At reset, the binary values shown below each bit in Figure 9-3 are written to
the global memory interface control register. Values in bits 3-0 are the values
at these bits’ respective pins (AE, DE, CE1, and CE0). Reset has the following
effects (for both the local bus and the global bus):

- The PAGESIZE fields for STRB0 (bits 18-14) and STRB1 (bits 23-19) are
  set to 00111₂, which corresponds to 256 words.

- The WTCNT fields for STRB0 (bits 10-8) and STRB1 (bits 13-11) are set
  to 111₂, which corresponds to seven wait states.

- The STRB ACTIVE field (bits 28-24) is set to 11110₂, so that STRB0 is
  active for all addresses over the global (or local, for LSTRB0) memory
  interface.

- The STRB SWITCH field (bit 29) is set to 1 to insert a cycle between
  back-to-back reads that switch from STRB0 to STRB1 (or STRB1 to STRB0).

- The SWW fields for STRB0 (bits 5-4) and STRB1 (bits 7-6) are both set
  to 11₂ to set the internal ready signal to be the logical AND of the
  external ready signal (RDY) and the ready signal generated by the on-chip
  wait-state counter (RDYwtcnt).
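
The reset settings listed above can be combined into a single register word.
The following Python sketch packs the fields using the bit ranges quoted in the
text. It is only an illustration: the dictionary and function names are
hypothetical, and the four enable-pin bits (3-0) are passed in separately
because they reflect the external pins rather than a fixed reset value.

```python
# Field positions as (least-significant bit, width), from the bit ranges
# quoted in the text. The identifiers are illustrative, not TI mnemonics.
FIELDS = {
    "STRB0_SWW": (4, 2),
    "STRB1_SWW": (6, 2),
    "STRB0_WTCNT": (8, 3),
    "STRB1_WTCNT": (11, 3),
    "STRB0_PAGESIZE": (14, 5),
    "STRB1_PAGESIZE": (19, 5),
    "STRB_ACTIVE": (24, 5),
    "STRB_SWITCH": (29, 1),
}

# Values written at reset, as described in the list above.
RESET_FIELDS = {
    "STRB0_SWW": 0b11,
    "STRB1_SWW": 0b11,
    "STRB0_WTCNT": 0b111,
    "STRB1_WTCNT": 0b111,
    "STRB0_PAGESIZE": 0b00111,
    "STRB1_PAGESIZE": 0b00111,
    "STRB_ACTIVE": 0b11110,
    "STRB_SWITCH": 0b1,
}


def pack(values: dict, pins: int = 0) -> int:
    """Build a 32-bit register word; pins holds AE, DE, CE1, CE0 in bits 3-0."""
    word = pins & 0xF
    for name, value in values.items():
        lsb, width = FIELDS[name]
        if value >> width:
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << lsb
    return word


if __name__ == "__main__":
    # With all four enable pins low, the reset pattern is 3E39FFF0h.
    print(f"{pack(RESET_FIELDS):08X}h")
```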
Figure 9-2. Location of the Memory-Interface Control Registers

0010 0000h                  Global memory interface control register
0010 0001h - 0010 0003h     Reserved
0010 0004h                  Local memory interface control register


Figure 9-3. Fields in the Memory-Interface Control Registers

Bits     Field             Access   Value at Reset        Reference
31-30    Reserved (xx)     -        read as 0
29       STRB SWITCH       R/W      1
28-24    STRB ACTIVE       R/W      11110                 Table 9-4, Table 9-5
23-19    STRB1 PAGESIZE    R/W      00111                 Table 9-3
18-14    STRB0 PAGESIZE    R/W      00111                 Table 9-3
13-11    STRB1 WTCNT       R/W      111
10-8     STRB0 WTCNT       R/W      111
7-6      STRB1 SWW         R/W      11                    Table 9-6
5-4      STRB0 SWW         R/W      11                    Table 9-6
3        AE                R        value of AE pin
2        DE                R        value of DE pin
1        CE1               R        value of CE1 pin
0        CE0               R        value of CE0 pin

Notes: 1) The register figure contains global-memory interface-control register mnemonics. For local-memory
          interface-control register mnemonics, add an L prefix to each mnemonic in the figure (e.g., LSTRB SWW,
          LCE0, etc.).

       2) The 1s and 0s listed for each field are the binary values written to the register at reset. The values at
          bits 3-0 are defined by the values of their respective external pins (AE, DE, CE1, and CE0).

       3) These registers are shown in the overall memory map in Figure 4-1 and Figure 4-3.

       4) R/W = read/write; R = read.
Note:

Mnemonics used are for the global memory interface control register. For the
local-memory interface-control register, add the prefix L to each mnemonic
(e.g., LCE0, LCE1, LSTRB1, etc.). The description remains the same for the
local-memory interface-control register.

CE0               Value of external pin CE0 (after it passes through an internal
                  synchronizer). The value is not latched.

CE1               Value of external pin CE1 (after it passes through an internal
                  synchronizer). The value is not latched.

DE                Value of external pin DE (after it passes through an internal
                  synchronizer). The value is not latched.

AE                Value of external pin AE (after it passes through an internal
                  synchronizer). The value is not latched.

STRB0 SWW         Software wait states for STRB0 access. In conjunction with
                  STRB0 WTCNT, this field defines the mode of wait-state
                  generation. Actual wait states are explained in Section 9.4
                  and in Table 9-6.

STRB1 SWW         Software wait states for STRB1 access. In conjunction with
                  STRB1 WTCNT, this field defines the mode of wait-state
                  generation. Actual wait states are explained in Section 9.4
                  and in Table 9-6.

STRB0 WTCNT       Software wait-state count for STRB0 accesses. Specifies the
                  number of cycles to use when software wait states are active.
                  Three-bit range is from 000₂ (zero) to 111₂ (seven).

STRB1 WTCNT       Software wait-state count for STRB1 accesses. Specifies the
                  number of cycles to use when software wait states are active.
                  Three-bit range is from 000₂ (zero) to 111₂ (seven).

STRB0 PAGESIZE    Page size for STRB0 accesses. Specifies the number of MSBs of
                  the address to use to define the bank size for STRB0
                  accesses. See ranges in Table 9-3 and subsection 9.3.2.

STRB1 PAGESIZE    Page size for STRB1 accesses. Specifies the number of MSBs of
                  the address to use to define the bank size for STRB1
                  accesses. See ranges in Table 9-3 and subsection 9.3.2.

STRB ACTIVE       Specifies address ranges over which STRB0 and STRB1 are
                  active. See ranges in Table 9-4 for STRB ACTIVE and
                  Table 9-5 for LSTRB ACTIVE.

STRB SWITCH       Inserts a single cycle between back-to-back reads that switch
                  from STRB0 to STRB1 (or vice versa).
                  When 1, insert cycle.
                  When 0, don’t insert cycle.

Reserved          Read as zeros.
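
Because several independent fields share one 32-bit register, changing a single
field is a read-modify-write operation. The sketch below shows the bit
manipulation in Python on plain integers; the helper names are hypothetical,
and on a real system the value would be read from and written back to the
register address rather than held in a variable.

```python
def get_field(reg: int, lsb: int, width: int) -> int:
    """Extract a field of the given width starting at bit lsb."""
    return (reg >> lsb) & ((1 << width) - 1)


def set_field(reg: int, lsb: int, width: int, value: int) -> int:
    """Return reg with one field replaced and all other bits preserved."""
    mask = (1 << width) - 1
    if value & ~mask:
        raise ValueError("value does not fit in the field")
    return (reg & ~(mask << lsb) & 0xFFFFFFFF) | (value << lsb)


if __name__ == "__main__":
    # Illustrative starting value: the reset settings described in the text,
    # with the four pin bits (3-0) reading as 0.
    reg = 0x3E39FFF0
    # Reduce STRB0 WTCNT (bits 10-8) from seven wait states to two.
    reg = set_field(reg, lsb=8, width=3, value=2)
    print(f"{reg:08X}h")
```

Only the STRB0 WTCNT bits change; the STRB1 settings, page sizes, and
STRB ACTIVE field keep their previous values.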
Table 9-3. Page Size as Defined by STRB0/1 PAGESIZE Bits†

STRBx PAGESIZE     External Address Bus     External Address Bus      Page Size
(Bits 14-18,       Bits Defining the        Bits Defining Address     (32-Bit Words)
19-23)‡            Current Page             on a Page

00000-00110        Reserved                 Reserved                  Reserved
00111¶             30-8                     7-0                       2^8 = 256
01000              30-9                     8-0                       2^9 = 512
01001              30-10                    9-0                       2^10 = 1K
01010              30-11                    10-0                      2^11 = 2K
01011              30-12                    11-0                      2^12 = 4K
01100              30-13                    12-0                      2^13 = 8K
01101              30-14                    13-0                      2^14 = 16K
01110              30-15                    14-0                      2^15 = 32K
01111              30-16                    15-0                      2^16 = 64K
10000              30-17                    16-0                      2^17 = 128K
10001              30-18                    17-0                      2^18 = 256K
10010              30-19                    18-0                      2^19 = 512K
10011              30-20                    19-0                      2^20 = 1M
10100              30-21                    20-0                      2^21 = 2M
10101              30-22                    21-0                      2^22 = 4M
10110§             30-23                    22-0                      2^23 = 8M
10111              30-24                    23-0                      2^24 = 16M
11000              30-25                    24-0                      2^25 = 32M
11001              30-26                    25-0                      2^26 = 64M
11010              30-27                    26-0                      2^27 = 128M
11011              30-28                    27-0                      2^28 = 256M
11100              30-29                    28-0                      2^29 = 512M
11101              30                       29-0                      2^30 = 1G
11110              None                     30-0                      2^31 = 2G
11111              Reserved                 Reserved                  Reserved

† Mnemonics used are for the global-memory interface-control register. For the local-memory
  interface-control register, add the prefix L to the beginning of each mnemonic (e.g.,
  LSTRB0 PAGESIZE, LSTRB1 PAGESIZE, etc.). The description is the same for the local-memory
  interface-control register.
‡ The x in STRBx means that the data in the columns are for STRB0 or STRB1.
§ A STRBx PAGESIZE field of 10110₂ is depicted in Figure 9-5 on page 9-13.
¶ Value at reset.
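
Every non-reserved row of Table 9-3 follows one rule: a PAGESIZE value of n
gives a page of 2^(n+1) words, with address bits n-0 locating a word on the
page and bits 30-(n+1) identifying the page. The Python sketch below
reproduces the table rows from that rule. The function name is hypothetical
and the code is an illustration of the arithmetic only.

```python
def page_layout(pagesize: int):
    """Return (words_per_page, page_bits, offset_bits) for a PAGESIZE value.

    page_bits and offset_bits are (high, low) bit-number pairs; page_bits is
    None when the whole 31-bit address lies on a single page.
    """
    if not 0b00111 <= pagesize <= 0b11110:
        raise ValueError("reserved PAGESIZE value")
    low_page_bit = pagesize + 1
    words = 1 << low_page_bit
    page_bits = (30, low_page_bit) if low_page_bit <= 30 else None
    return words, page_bits, (pagesize, 0)


if __name__ == "__main__":
    for field in (0b00111, 0b10110, 0b11110):
        print(f"{field:05b}", page_layout(field))
```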




Table 9-4. Address Ranges Specified by STRB ACTIVE Bits†

STRB ACTIVE      STRB0 ACTIVE              Size of STRB0 ACTIVE    STRB1 ACTIVE
Field            Address Range             Address Range           Address Range
(Bits 24-28)

00000-01110      Reserved                  Reserved                Reserved
01111            8000 0000-8000 FFFF       2^16 = 64K              8001 0000-FFFF FFFF
10000            8000 0000-8001 FFFF       2^17 = 128K             8002 0000-FFFF FFFF
10001            8000 0000-8003 FFFF       2^18 = 256K             8004 0000-FFFF FFFF
10010            8000 0000-8007 FFFF       2^19 = 512K             8008 0000-FFFF FFFF
10011            8000 0000-800F FFFF       2^20 = 1M               8010 0000-FFFF FFFF
10100            8000 0000-801F FFFF       2^21 = 2M               8020 0000-FFFF FFFF
10101            8000 0000-803F FFFF       2^22 = 4M               8040 0000-FFFF FFFF
10110            8000 0000-807F FFFF       2^23 = 8M               8080 0000-FFFF FFFF
10111            8000 0000-80FF FFFF       2^24 = 16M              8100 0000-FFFF FFFF
11000            8000 0000-81FF FFFF       2^25 = 32M              8200 0000-FFFF FFFF
11001            8000 0000-83FF FFFF       2^26 = 64M              8400 0000-FFFF FFFF
11010            8000 0000-87FF FFFF       2^27 = 128M             8800 0000-FFFF FFFF
11011            8000 0000-8FFF FFFF       2^28 = 256M             9000 0000-FFFF FFFF
11100            8000 0000-9FFF FFFF       2^29 = 512M             A000 0000-FFFF FFFF
11101            8000 0000-BFFF FFFF       2^30 = 1G               C000 0000-FFFF FFFF
11110‡           8000 0000-FFFF FFFF       2^31 = 2G               None
11111            Reserved                  Reserved                Reserved

† Address ranges specified by the LSTRB ACTIVE bits are listed in Table 9-5.
‡ Value at reset.
Table 9-5. Address Ranges Specified by LSTRB ACTIVE Bits†

LSTRB ACTIVE     LSTRB0 ACTIVE             Size of LSTRB0 ACTIVE   LSTRB1 ACTIVE
Field            Address Range             Address Range           Address Range
(Bits 24-28)

00000-01110      Reserved                  Reserved                Reserved
01111            0000 0000-0000 FFFF       2^16 = 64K              0001 0000-7FFF FFFF
10000            0000 0000-0001 FFFF       2^17 = 128K             0002 0000-7FFF FFFF
10001            0000 0000-0003 FFFF       2^18 = 256K             0004 0000-7FFF FFFF
10010            0000 0000-0007 FFFF       2^19 = 512K             0008 0000-7FFF FFFF
10011            0000 0000-000F FFFF       2^20 = 1M               0010 0000-7FFF FFFF
10100            0000 0000-001F FFFF       2^21 = 2M               0020 0000-7FFF FFFF
10101            0000 0000-003F FFFF       2^22 = 4M               0040 0000-7FFF FFFF
10110            0000 0000-007F FFFF       2^23 = 8M               0080 0000-7FFF FFFF
10111            0000 0000-00FF FFFF       2^24 = 16M              0100 0000-7FFF FFFF
11000            0000 0000-01FF FFFF       2^25 = 32M              0200 0000-7FFF FFFF
11001            0000 0000-03FF FFFF       2^26 = 64M              0400 0000-7FFF FFFF
11010            0000 0000-07FF FFFF       2^27 = 128M             0800 0000-7FFF FFFF
11011            0000 0000-0FFF FFFF       2^28 = 256M             1000 0000-7FFF FFFF
11100            0000 0000-1FFF FFFF       2^29 = 512M             2000 0000-7FFF FFFF
11101            0000 0000-3FFF FFFF       2^30 = 1G               4000 0000-7FFF FFFF
11110‡           0000 0000-7FFF FFFF       2^31 = 2G               None
11111            Reserved                  Reserved                Reserved

† Address ranges below 0030 0000h are valid only in microprocessor mode (ROMEN=0). Access to reserved, peripheral, and
  on-chip memory areas does not activate LSTRB signals.
‡ Value at reset.
9.3.1 Mapping Addresses to Strobes

Figure 9-4 demonstrates the relationship between the STRB ACTIVE bits
(see Figure 9-3 on page 9-7 for more information) and the address ranges
over which the signals STRB0 and STRB1 are active. Note that the address
ranges of STRBx and LSTRBx also govern the ranges of the associated
signals: RDYx, LRDYx, R/Wx, LR/Wx, PAGEx, LPAGEx, etc. (where x = 1 or 0).

Figure 9-4. Effects of STRB ACTIVE on Global Memory Bus Memory Map

(a) STRB ACTIVE = 11110₂
    8000 0000h - FFFF FFFFh    STRB0 active (2G words)

(b) STRB ACTIVE = 10101₂
    8000 0000h - 803F FFFFh    STRB0 active (4M words)
    8040 0000h - FFFF FFFFh    STRB1 active (2G minus 4M words)

NOTE: Shown here are two examples for the global memory map. The entire ’C40 memory
      map (local and global) is shown in Figure 4-1 on page 4-3. Note that the highest
      address for LSTRB1 (local bus) is 7FFF FFFFh.

Example (a) of Figure 9-4 shows the reset condition (STRB ACTIVE = 11110₂).
In this case, signal STRB0 is active over the entire address range of the global
memory bus (see Table 9-4 for fields and address ranges of STRB ACTIVE).

Example (b) of Figure 9-4 shows the global memory bus memory map when
STRB ACTIVE = 10101₂. In this case, STRB0 is active from addresses
8000 0000h-803F FFFFh, and STRB1 is active from addresses
8040 0000h-FFFF FFFFh (as shown in Table 9-4 for a STRB ACTIVE of
10101₂).
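
The address ranges in Table 9-4 and Table 9-5 follow a single rule: a
STRB ACTIVE value of n gives STRB0 the first 2^(n+1) words of the bus and
STRB1 the remainder. The Python sketch below applies that rule to find the
strobe for an address. The function names are hypothetical, and the sketch
deliberately ignores the Table 9-5 footnote about reserved, peripheral, and
on-chip areas of the local address space, which do not activate LSTRB signals.

```python
GLOBAL_BASE, GLOBAL_TOP = 0x8000_0000, 0xFFFF_FFFF
LOCAL_BASE, LOCAL_TOP = 0x0000_0000, 0x7FFF_FFFF


def strobe_ranges(active: int, local: bool = False):
    """Return ((strb0_lo, strb0_hi), strb1_range_or_None) for a field value."""
    if not 0b01111 <= active <= 0b11110:
        raise ValueError("reserved STRB ACTIVE value")
    base, top = (LOCAL_BASE, LOCAL_TOP) if local else (GLOBAL_BASE, GLOBAL_TOP)
    size = 1 << (active + 1)  # words covered by STRB0 (or LSTRB0)
    strb0 = (base, base + size - 1)
    strb1 = (base + size, top) if base + size <= top else None
    return strb0, strb1


def select_strobe(address: int, active: int, local: bool = False) -> int:
    """Return 0 or 1 for the strobe that an address on this bus activates."""
    strb0, strb1 = strobe_ranges(active, local)
    if strb0[0] <= address <= strb0[1]:
        return 0
    if strb1 is not None and strb1[0] <= address <= strb1[1]:
        return 1
    raise ValueError("address is not on this bus")


if __name__ == "__main__":
    # Example (b) of Figure 9-4: STRB ACTIVE = 10101
    print(select_strobe(0x803F_FFFF, 0b10101))  # last word on STRB0
    print(select_strobe(0x8040_0000, 0b10101))  # first word on STRB1
```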




9.3.2 Page Size Operation


Within the memory range selected by any of the four strobe lines, the ’C4x
external interface allows you to further divide the range into pages of
selected length. This capability gives you flexibility in the design of
high-speed, high-density memory systems combined with slower peripheral
devices; each time a page boundary is crossed, a cycle is inserted to allow
external logic to reconfigure itself.


Each PAGESIZE field in the memory interface control register (shown in
Figure 9-3 on page 9-7) works in the same manner to specify the page size
for its corresponding strobe. Table 9-3 on page 9-9 illustrates the
relationship between the PAGESIZE field, the bits of the address used to
define the current page, and the resulting page size. Page size begins at 256
words (with external address-bus bits 7-0 defining the address on a page) and
ranges up to 2G words on the ’C40 (with external address-bus bits 30-0
defining the address on a page). The example in Figure 9-5 shows how a
PAGESIZE field value of 10110₂ is translated into bits 30-23 defining the
current page and bits 22-0 defining an address on a page.


Figure 9-5. STRBx PAGESIZE Fields Example

Bits 30-23    External address bus bits defining the current page
Bits 22-0     External address bus bits defining the address on a page

Note: This figure represents a STRBx PAGESIZE field value of 10110₂ (as shown in Table 9-3).


Changing from one page to another causes a cycle to be inserted in the
external access sequence, allowing external logic to reconfigure itself
appropriately. For example, the extra cycle allows time for slower devices to
get off the bus, thereby eliminating bus contention. The memory interface
control logic keeps track of the address used for the last access for each
STRB. When an access begins, the PAGE signal corresponding to the active STRB
goes inactive (high) if the access is to a new page. The PAGE0 and PAGE1
signals are independent of one another, each having its own page-size logic.


At reset, the page-control logic is initialized so that the extra cycle is inserted 
for the first access to the two strobe interfaces. 


The control registers for the local memory interface function in the same way 
as the control registers for the global memory interface. 
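
The page-tracking behavior described in this subsection can be modeled in a few
lines. The Python sketch below keeps the page number of the last access for one
strobe and reports when a new access falls on a different page. The class name
is hypothetical, and the model covers only the comparison logic, not bus
timing. One tracker would be needed per strobe, since PAGE0 and PAGE1 are
independent.

```python
class PageTracker:
    """Model of the page-comparison logic for a single strobe."""

    def __init__(self, pagesize: int):
        # A PAGESIZE value of n leaves bits n-0 as the on-page address,
        # so the page number is the address shifted right by n + 1.
        self.shift = pagesize + 1
        # None models the state after reset, where the first access is
        # always treated as a page change.
        self.last_page = None

    def access(self, address: int) -> bool:
        """Record an access; return True if it is to a new page."""
        page = address >> self.shift
        new_page = page != self.last_page
        self.last_page = page
        return new_page


if __name__ == "__main__":
    tracker = PageTracker(pagesize=0b00111)  # 256-word pages (reset value)
    for addr in (0x8000_0000, 0x8000_00FF, 0x8000_0100):
        print(f"{addr:08X}h new page: {tracker.access(addr)}")
```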
9.4 Programmable Wait States


The ’C4x has its own internal software-configurable ready-generation
capability for each strobe. This software wait-state generator is controlled
by configuring two fields in the global or local interface control register.
Use the STRBx WTCNT field (bits 8-10 and 11-13) to specify the number of
software wait states to generate, and use the STRBx SWW field (bits 4-5 and
6-7) to select one of the following four modes of wait-state generation:

- External RDY (SWW = 00₂). Wait states are generated solely by the external
  RDY line (software wait states ignored).

- WTCNT-generated RDYwtcnt (SWW = 01₂). Wait states are generated
  solely by the software wait-state generator (external RDY ignored).

- Logical-OR of RDY and RDYwtcnt (SWW = 10₂). Wait states are generated
  with a logical OR of internal and external ready signals. Either signal can
  generate ready.

- Logical-AND of RDY and RDYwtcnt (SWW = 11₂). Wait states are generated
  with a logical AND of internal and external ready signals. Both signals
  must occur.


The four modes are used to generate the internal ready signal, RDYint, that
controls accesses. As long as RDYint = 1, the current external access is
extended. When RDYint = 0, the current access completes. Since the use of
programmable wait states for both external interfaces is identical, only the
global-bus interface is described in this section.


RDYwtcnt is an internally generated ready signal. When an external access is
begun, the value in WTCNT is loaded into a counter. WTCNT can be any value
from 0 through 7. The counter is decremented every H1/H3 clock cycle until
it becomes 0. Once the counter is cleared to 0, it remains cleared to 0 until
the next access. When the counter is nonzero, RDYwtcnt = 1. When the counter
is 0, RDYwtcnt = 0.


Table 9-6 is the truth table for each value of SWW, showing the different
values at RDY, RDYwtcnt, and RDYint.


Note:

At reset, the ’C4x inserts seven wait states for each access to external
memory. These wait states are inserted to ensure that the system can function
with slow memories. To increase system performance when using fast external
memories, you will need to decrease the number of wait states.


Table 9-6. Wait-State Generation for Each Value of SWW

SWW
Value    RDY    RDYwtcnt    RDYint    Description

00       0      0           0         RDYint is dependent only upon RDY.
00       0      1           0         RDYwtcnt is ignored.
00       1      0           1
00       1      1           1

01       0      0           0         RDYint is dependent only upon
01       0      1           1         RDYwtcnt. RDY is ignored.
01       1      0           0
01       1      1           1

10       0      0           0         RDYint is the logical-OR (electrical
10       0      1           0         AND because these signals are low
10       1      0           0         true) of RDY and RDYwtcnt.
10       1      1           1

11       0      0           0         RDYint is the logical-AND (electrical
11       0      1           1         OR because these signals are low
11       1      0           1         true) of RDY and RDYwtcnt.
11       1      1           1
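
The truth table and the counter behavior described in this section can be
combined into a small behavioral model. The Python sketch below treats all
three signals as low-true, as Table 9-6 does, and counts the cycles for which
an access is extended. The function names, the `max_cycles` guard, and the
exact alignment between counter decrements and RDY samples are assumptions
made for illustration; consult the timing figures for the real cycle
relationships.

```python
def rdy_int(sww: int, rdy: int, rdy_wtcnt: int) -> int:
    """Internal ready per Table 9-6 (0 = ready, 1 = extend the access)."""
    if sww == 0b00:
        return rdy                # external RDY only
    if sww == 0b01:
        return rdy_wtcnt          # wait-state counter only
    if sww == 0b10:
        return rdy & rdy_wtcnt    # logical OR: either source ends the access
    if sww == 0b11:
        return rdy | rdy_wtcnt    # logical AND: both sources must be ready
    raise ValueError("SWW is a 2-bit field")


def wait_states(sww: int, wtcnt: int, external_rdy, max_cycles: int = 64) -> int:
    """Count the cycles an access is extended.

    external_rdy is a sequence of sampled RDY levels, one per cycle; the
    last entry is reused once the sequence runs out.
    """
    counter = wtcnt  # loaded when the access begins
    for cycle in range(max_cycles):
        rdy = external_rdy[min(cycle, len(external_rdy) - 1)]
        rdy_wtcnt = 1 if counter else 0
        if rdy_int(sww, rdy, rdy_wtcnt) == 0:
            return cycle
        if counter:
            counter -= 1
    raise RuntimeError("access never completed within max_cycles")


if __name__ == "__main__":
    # Reset configuration: SWW = 11, WTCNT = 7, external RDY held low (ready).
    print(wait_states(0b11, 7, [0]))
    # SWW = 10: external RDY goes low in the third cycle and ends the access early.
    print(wait_states(0b10, 7, [1, 1, 0]))
```

In this model the reset configuration yields seven extended cycles when the
external RDY line is already low, matching the seven wait states described in
the note above.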
9.5 Memory Interface Timing


Except for some cases that are covered in detail later in this chapter, the
’C4x offers a one-cycle external read and a pipelined external write. A write
is considered a two-step operation: one cycle writes the data into the
external memory port buffer, and then another cycle moves the data from there
to external memory.


Note:

From the perspective of the DMA or CPU, the write operation finishes in one
cycle, and the DMA or CPU can proceed. However, if the next DMA or CPU
access is to the same external bus, the DMA or CPU must wait, and the write
is considered a two-cycle operation.


Figure 9-6. STRB and RDY Timing 


Note: The dotted lines emphasize the relationships between the signals. 


As shown in Figure 9-6, STRB changes on the falling edge of H1, and RDY
is sampled on the falling edge of H1. Throughout the other timing diagrams in
this section, the following general rules apply to the logical timing of the
parallel external interfaces:

- Changes of R/W are always framed by STRB.

- A page boundary crossing for a particular STRB results in the corresponding
  PAGE signal going high for one cycle.

- R/W transitions always occur on the rising edge of H1.

- STRB transitions always occur on the falling edge of H1.

- RDY is always sampled on the falling edge of H1.

- Data is always sampled during a read on the falling edge of H1.

- Data is always driven out during a write on the falling edge of H1.

- Data is always stopped from being driven during a write on the rising edge
  of H1.

- The status and PAGE signals, following a read, change on the falling edge
  of H1. The address also changes on the falling edge of H1.

- The status and PAGE signals, following a write, change on the falling edge
  of H1; the address changes on the rising edge of H1.

- The fetch of an interrupt vector over an external interface is identified by
  the status signals for that interface (STAT or LSTAT) as a data read.

- The interlocked operation status signals (LOCK and LLOCK) have the
  same timing as the STAT and LSTAT status signals, respectively.

- Any time PAGE goes high, STRB goes high.


Note:

When no external port is accessing memory (idle status), the control lines are
inactive (RDY is ignored, STRB is high, and the STATx lines become high),
the address lines keep the last value used in the pins, and the data lines
become high-impedance. This can be seen in Figure 9-16.


Figure 9-7 illustrates a read, read, write sequence. This figure assumes that
all three accesses are to the same page and that they are STRB1 accesses.
This timing diagram illustrates that:

- Back-to-back reads to the same page are single-cycle accesses.

- STRB stays low during back-to-back reads.

- When the transition from a read to a write is done, STRB goes high for one
  cycle to frame the R/W signal changing.


Figure 9-7. Read Same Page, Read Same Page, Write Same Page Sequence

[Timing diagram. Status sequence on STAT3-STAT0 and LOCK: STRB1 read,
STRB1 read, STRB1 write.]


Note: Strobe and Ready Further Defined 


Strobe and ready are discussed from the application viewpoint in 
TMS320C4x General-Purpose Applications User’s Guide. 
Figure 9-8 shows that:

- To prevent unwanted writes, STRB goes high between back-to-back
  writes to disable the memory while the address changes.

- As in Figure 9-7, STRB goes high between a write and a read, and it
  frames the R/W transition.

- A read following a write on the same bus takes two cycles. This happens
  regardless of whether or not the read is on the same strobe and/or page.

- Consecutive writes take two cycles.


Figure 9-8. Write Same Page, Write Same Page, Read Same Page Sequence

[Timing diagram. Status sequence on STAT3-STAT0 and LOCK: STRB1 write,
STRB1 read.]




Figure 9-9 shows that going from one page to another on back-to-back reads
causes:

- An extra cycle to be inserted to allow the next memory to be selected

- The transition to be signaled by PAGE going high for one cycle

- STRB1 to go high for one cycle


Figure 9-9. Read Same Page, Read Different Page, Read Same Page Sequence 
Figure 9-10 shows that on back-to-back writes, when a page switch occurs:

- PAGE1 signals this occurrence by going high for one cycle.

- No extra cycle is inserted, because write cycles exhibit an inherent
  one-half H1 cycle setup of address information before STRB goes low.


Figure 9-10. Write Same Page, Write Different Page, Write Same Page Sequence 
Figure 9-11. Write Same Page, Read Different Page, Write Different Page Sequence

Figure 9-12. Read Different Page, Read Different Page, Write Same Page Sequence

Figure 9-13. Write Different Page, Write Different Page, Read Same Page Sequence

Figure 9-14. Read Same Page, Write Different Page, Read Different Page Sequence


Figure 9-15 through Figure 9-19 illustrate idle bus cycles. Idle bus cycle
timing is similar to read cycle timing. The primary differences are that no
data is read, STRB is held high, and RDY is ignored.


Figure 9-15. Read Same Page, Idle One Cycle, Read Same Page Sequence

Figure 9-16. Write Same Page, Idle One Cycle, Write Different Page Sequence

Figure 9-17. Idle, Read Different Page, Idle Sequence

Figure 9-18. Idle, Write Same Page, Idle Sequence

Figure 9-19. Write Different or Same Page, Idle, Idle Sequence
Figure 9-20 illustrates a STRB1 read followed by a STRB0 read when
STRB SWITCH = 0. This mode allows the reads to be back-to-back, with no
cycles inserted between them when they are activating different strobes.


Figure 9-20. Read Same Page on STRB1, STRB0, and on STRB1 Sequence When
STRB SWITCH = 0


Figure 9-21 is similar to Figure 9-20 except that the second STRB1 read is
from a different page than the first.


Figure 9-21. Read Same Page on STRB1, STRB0, Read Different Page on STRB1
Sequence When STRB SWITCH = 0


Figure 9-22 illustrates a STRB1 read followed by a STRB0 read when
STRB SWITCH = 1. In this mode, a cycle is inserted between back-to-back
reads that activate different strobes. Some memory configurations require this
cycle between strobe transitions to prevent bus conflicts during back-to-back
reads on different strobes.


Figure 9-22. Read Same Page on STRB1, STRB0, and on STRB1 Sequence When
STRB SWITCH = 1


Figure 9-23 is similar to Figure 9-22 except that the second STRB1 read is
from a different page than the first.


Figure 9-23. Read Same Page on STRB1, STRB0, Read Different Page on STRB1
Sequence When STRB SWITCH = 1


Figure 9-24. Write Same Page on STRB1, STRB0, Read Same Page on STRB1
Figure 9-25 and Figure 9-26 show one-wait-state read and write operations,
respectively.


Figure 9-25. Read With One Wait State

Figure 9-26. Write With One Wait State
9.6 Using Enable Signals to Control Signal Groups


Figure 9-27. Using Enable Signals to Put Signal Groups in a High-Impedance State


Figure 9-27 shows an enable signal controlling the corresponding signal
group. For example, signal DE controls the global external-interface data
signals. The enable signals are unsynchronized inputs that turn off the
corresponding output buffers. The signal group goes into high impedance at a
delay, labeled timing (1) in Figure 9-27, after the enable signal goes high.
The signal group comes out of high impedance at a delay, labeled timing (2) in
Figure 9-27, after the enable signal goes low. If the signal group is already
in a high-impedance state before the enable signal goes high, the group will
come out of the high-impedance state (when the enable signal goes low) only if
the signal group is in a state requiring it to do so. For example, a data bus
that was not being driven will be driven after being enabled, if an access is
pending for the data bus.


Note:

If you intend to use internally generated wait states, be certain that no data
is read from or written to the bus when it is disabled. This is because it is
possible for a bus to be in the high-impedance state with internally generated
wait states. In this case, data that is written will not be seen externally,
and data that is read will be whatever value is sampled on the high-impedance
bus.
9.7 Interlocked Operations


One of the most common parallel processing configurations is the sharing of
global memory by multiple processors. For multiple processors to access this
global memory and share data in a coherent manner, some sort of arbitration
or handshaking is necessary. ’C4x interlocked operations meet this
requirement for arbitration. More details are given in Section 9.7.5 on
page 9-44.


Five ’C4x instructions are referred to as interlocked operations. Through the
use of external signals, these instructions provide synchronization
mechanisms that guarantee the integrity of communication while still allowing
high-speed operation. The interlocked-operation instruction group is listed in
Table 9-7.


Table 9-7. Interlocked Operations

Instruction   Description                                        Operation

LDFI          Load floating-point value from memory into a       Signal interlocked
              register; interlocked when external memory         src -> dst
              accessed

LDII          Load integer from memory into a register;          Signal interlocked
              interlocked when external memory accessed          src -> dst

SIGI          Signal, interlocked; performs an interlocked       Signal interlocked
              read when external memory accessed                 Clear interlock

STFI          Store floating-point value from a register to      src -> dst
              memory; interlocked when external memory           Clear interlock
              accessed

STII          Store integer from a register to memory;           src -> dst
              interlocked when external memory accessed          Clear interlock


The interlocked operations use the global- and local-bus pins, LOCK and
LLOCK, to reflect a currently executing interlocked operation. This signal is
active (low) when any of the interlocked instructions in Table 9-7 are
executing.


The external timing for interlocked loads and stores is the same as for
standard loads and stores. You can extend interlocked loads and stores like
standard accesses by using the appropriate ready signal (RDYx or LRDYx).


9.7.1 LDFI and LDII


The LDFI and LDII instructions perform the following actions:

1) Pull (L)LOCK low.

2) Execute an LDF or LDI instruction.

3) Extend the read cycle until the appropriate ready signal is received.
   Complete the instruction.

4) Leave (L)LOCK active low until changed by an STFI, STII, or SIGI.


The read/write operation is identical to any other read/write cycle except for
the special use of (L)LOCK. The src operand for LDFI and LDII is always a
direct or indirect memory address. (L)LOCK is set to 0 only if the src is
located off-chip (i.e., STRB or LSTRB is active). If on-chip memory is
accessed, then (L)LOCK is not asserted, and the operation is as an LDF or LDI
from internal memory.


9.7.2 STFl and STIl 
The STFI and STII instructions perform the following operations: 


1) Begin awrite cycle. The state of (L)LOCK does not change. If it is low, an 
interlocked operation occurs. If high, the operation is as if an STF or STI 
is performed (not interlocked). 


2) Execute an STF or STI instruction and extend the write cycle until the ap- 
propriate ready is signaled. 


3) After the write cycle, bring (L)LOCK inactive (high). 


As in the case for LDFI and LDII, the dst of STFI and STII affects (L)LOCK. If
dst is located off-chip (STRB(0,1) or LSTRB(0,1) is active), (L)LOCK is set to
1. If on-chip memory is accessed, then (L)LOCK is not asserted, and the
operations are the same as an STF or STI to internal memory.
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Interlocked Operations 


The SIGI instruction can be used in a variety of ways. In some applications,
you may wish to modify semaphores externally, perhaps with special-purpose
logic. If so, SIGI can be used to perform a single-cycle interlocked access of
the semaphore. The SIGI instruction can also be used simply to perform an
external read and to signal that a particular point in your code has been
reached.


The SIGI instruction functions as follows: 


1) Pulls (L)LOCK low 
2) Executes an LDI instruction 


3) Extends the read cycle until the appropriate ready signal is received. Com- 
pletes the instruction 


4) Brings (L)LOCK back inactive high 
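
The pin behavior described for LDII/LDFI, STII/STFI, and SIGI can be summarized as a small state model. The following Python sketch is only an illustration of the rules stated in this section, not device code; the class name `LockPin` and its `execute` method are hypothetical, and bus timing is ignored entirely.

```python
INTERLOCKED_LOADS = {"LDII", "LDFI"}
INTERLOCKED_STORES = {"STII", "STFI"}


class LockPin:
    """Toy model of the active-low (L)LOCK pin. True means high (inactive)."""

    def __init__(self):
        self.high = True

    def execute(self, mnemonic, external=True):
        """Return the pin level left behind after one interlocked instruction."""
        known = INTERLOCKED_LOADS | INTERLOCKED_STORES | {"SIGI"}
        if mnemonic not in known:
            raise ValueError(f"not an interlocked instruction: {mnemonic}")
        if not external:
            # On-chip access: the pin is not touched.
            return self.high
        if mnemonic in INTERLOCKED_LOADS:
            # Pulled low and left low until an STFI, STII, or SIGI.
            self.high = False
        else:
            # STII/STFI release the pin after the write cycle;
            # SIGI drives it low for the read and brings it back high.
            self.high = True
        return self.high
```

For example, `LockPin().execute("LDII")` leaves the modeled pin low, and a following `execute("STII")` or `execute("SIGI")` returns it high.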


Interlocked operations can be used to implement a busy-waiting loop, to
manipulate a multiprocessor counter, to implement a simple semaphore
mechanism, or to perform synchronization between two ’C4x devices. The
following examples illustrate the usefulness of the interlocked instructions.


9.7.4 Interlocked Examples 


Examples in this section show you how interlocked operations can be used to 
implement: 


- A busy-waiting loop to synchronize processors at the software level
  (Example 9-1)


- A counter shared between cooperative processors that defines the number
  of times a task should be done by the processors (Example 9-2)


- Semaphores to ease the programming of critical sections (Example 9-3
  and Example 9-4)


Example 9-1 shows the implementation of a busy-waiting loop. The ’C4x 
stays in this loop until another processor writes a 0 in @LOCK. If location 
LOCK is the interlock for a critical section of code, and a nonzero means the 
lock is busy, the algorithm for a busy-waiting loop can be used as shown. 
Example 9-1. Busy-Waiting Loop


        LDI     1,R0            ;Put a 1 in R0
L1:     LDII    @LOCK,R1        ;Load lock value into R1
        STII    R0,@LOCK        ;Set lock value to 1
        BNZ     L1              ;If R1 (previous lock value) is not
                                ;0, read it again
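
As a host-side illustration of the busy-waiting idea, the sketch below models the interlocked load/store pair as one indivisible exchange on a shared word. It is a minimal Python model under stated assumptions, not ’C4x code: `SharedWord`, `load_then_store`, `acquire`, and `release` are hypothetical names, and a `threading.Lock` stands in for the bus arbitration that (L)LOCK makes possible in hardware.

```python
import threading


class SharedWord:
    """A word in shared memory; bus_lock stands in for the (L)LOCK handshake."""

    def __init__(self, value=0):
        self.value = value
        self.bus_lock = threading.Lock()

    def load_then_store(self, new_value):
        # Models an interlocked load followed by an interlocked store:
        # no other "processor" can touch the word between the two.
        with self.bus_lock:
            old = self.value
            self.value = new_value
        return old


def acquire(lock_word):
    # Busy-wait: keep writing 1 until the previous value read back was 0.
    while lock_word.load_then_store(1) != 0:
        pass


def release(lock_word):
    # Another processor writing 0 lets a waiting processor leave its loop.
    lock_word.load_then_store(0)
```

A thread that calls `acquire(word)` spins until some other thread calls `release(word)`, which mirrors a processor waiting for a 0 to appear in the lock location.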


Example 9-2 shows how a location COUNT may contain a count of the num- 
ber of times a particular operation must be performed. This operation may be 
performed by any processor in the system. If the count is zero, the processor 
waits until it is nonzero before beginning processing. The example also shows 
the algorithm for modifying COUNT correctly. 


Example 9-2. Task Counter Manipulation 


        LDI     0,R0
WAIT:   LDII    @COUNT,R1       ;Read current value of counter
        BZD     WAIT            ;If COUNT = 0, try again
        LDNZ    1,R0            ;If COUNT not zero, decrement it
        SUBI    R0,R1
        STII    R1,@COUNT       ;Update COUNT


Figure 9-28 illustrates multiple ’C4x devices sharing global memory and using
interlocked instructions as shown in Example 9-3 and Example 9-4.


Figure 9-28. Multiple 'C4x Devices Sharing Global Memory 
Example 9-3. Implementation of V(S)


V:      LDII    @S,R0
        ADDI    1,R0
        STII    R0,@S           ;S + 1 -> S


Example 9-4. Implementation of P(S)


        LDI     0,R0
P:      LDII    @S,R1           ;Read semaphore's current value
        BZD     P               ;If S = 0, go to P and try again
        LDNZ    1,R0            ;If S is not 0, decrement it
        SUBI    R0,R1
        STII    R1,@S           ;Update S


Sometimes it may be necessary for several processors to access some 
shared data or other common resources. The portion of code that must access 
the shared data is called a critical section. 


To ease the programming of critical sections, semaphores may be used. 
Semaphores are variables that can take only nonnegative integer values. Two 
primitive, indivisible operations are defined on semaphores (with S being a 
semaphore): 


V(S):   S + 1 -> S

P(S):   P: if (S == 0), go to P
           else S - 1 -> S


Indivisibility of V(S) and P(S) means that when these processes access and 
modify the semaphore S, they are the only processes doing so. 


To enter a critical section, a P operation is performed on a common sema- 
phore, for example, on S (S is initialized to 1). The first processor performing 
P(S) will be able to enter its critical section. All other processors are blocked 
because S has become 0. After leaving its critical section, the processor per- 
forms a V(S), thus allowing another processor to execute P(S) successfully. 


The ’C4x code for V(S) is shown in Example 9-3, and code for P(S) is shown 
in Example 9-4. Compare the code in Example 9-4 to the code in 
Example 9-2, which does not use semaphores. 
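
The P and V primitives can also be modeled on a host machine to check the intended behavior. The Python sketch below is an illustration only: the class `Semaphore` and its methods `v`, `try_p`, and `p` are hypothetical names, and a `threading.Lock` stands in for the indivisibility that the interlocked load/store pair provides on the shared bus.

```python
import threading


class Semaphore:
    """Nonnegative integer semaphore with indivisible P and V operations."""

    def __init__(self, initial=1):
        if initial < 0:
            raise ValueError("a semaphore takes only nonnegative values")
        self.s = initial
        self._bus = threading.Lock()   # stands in for (L)LOCK arbitration

    def v(self):
        # V(S): S + 1 -> S, as one indivisible read-modify-write.
        with self._bus:
            self.s += 1

    def try_p(self):
        """One pass of the P loop; True if S was nonzero and got decremented."""
        with self._bus:
            if self.s == 0:
                return False           # S stays 0; the caller must retry
            self.s -= 1
            return True

    def p(self):
        # P(S): spin until S is nonzero, then S - 1 -> S.
        while not self.try_p():
            pass
```

With `Semaphore(1)`, the first `try_p()` succeeds and later attempts fail until `v()` is called, which matches the critical-section protocol described above.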
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9.7.5 Bus-Lock Pins and Bus Timing 
The timing of the LOCK and LLOCK pins is the same as the timing of the
STAT(3-0) and LSTAT(3-0) pins. The LDII, LDFI, STII, STFI, and SIGI
instructions manipulate the bus-lock signals only when an external memory
access is made.


LDII, LDFI, and SIGI all clear LOCK or LLOCK to zero at the beginning of the
read cycle with H1 falling. STII, STFI, and SIGI all set LOCK or LLOCK to one
at the end of the access cycle on the falling edge of H1. Interlocked instructions
are explained in Section 9.7.


Figure 9-29 through Figure 9-32 show bus timing characteristics for several
external accesses using STII, LDII, STFI, LDFI, and SIGI.


Figure 9-29 is an example of an LDII or LDFI external access. 


Figure 9-29. LDII or LDFI External Access 
Figure 9-30 is an example of STII or STFI external access following the
previous interlocked load (shown in Figure 9-29) and an idle cycle. This is the
timing for an interlocked load/interlocked store sequence.


Figure 9-30. LDII or LDFI and STII or STFI External Access 


[Timing diagram not reproduced. Recoverable labels: R/W0, RDY0, R/W1,
STAT3-STAT0 (STRB1 read, idle, STRB1 write), and LOCK, spanning an LDII or
LDFI external access followed by an STII or STFI external access.]


Figure 9-31 is an example of a SIGI external access. 


Figure 9-31. SIGI External Access Timing 
Figure 9-32 illustrates timing for SIGI if the LOCK signal is already low. This
could occur when a SIGI follows an LDII instruction. Since LOCK is already
low, the only effect SIGI has on LOCK is to bring it high.


Figure 9-32. SIGI When LOCK Is Already Low
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IACK Timing 


The IACK pin is affected by the IACK (interrupt acknowledge) instruction. The
timing of the pin is similar to that of the LOCK pin when used by the SIGI
instruction. In all respects (timing, extension with wait states, etc.), IACK
behaves like a LOCK or STAT signal. The only difference is that there is only one
IACK pin.


The timing for the IACK pin is shown in Figure 9-33. Like the interlocked
instructions, the IACK instruction affects IACK only for an external access.
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Figure 9-33. IACK Timing 
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Chapter 10 


The Bootloader 


The bootloader provided in the on-chip ROM of the ’C4x can load and execute
source programs that are received from a host processor, an EPROM, or a
standard memory device. The ’C4x bootloader functions primarily as either a
memory bootloader or a communication port bootloader.
10.1 Bootloader Description 


The bootloader code starts at location 0x11bc in the on-chip ROM in both the
’C40 and ’C44. For ’C44 device revisions < 1.0, the ’C44 bootloader code is
identical to the ’C40 bootloader code. For ’C44 device revisions > 1.0, the ’C44
bootloader code differs in three memory locations from the ’C40 bootloader.
These three locations are noted in the code. The bootloader program is listed
in Section 10.7, The Bootloader Program.


10.2 Mode Selection 


The ’C4x bootloader functions primarily as either a memory bootloader or a
communication port bootloader. Bootloader mode selection is determined by
the IIOF(3-0) pins, as described in Table 10-1 and shown in Figure 10-1.


- The memory bootloader supports user-definable byte, half-word, and
  full-word data formats, which allow the flexibility to load a source program
  from memories having widths of 8 bits, 16 bits, or 32 bits. The source
  programs to be loaded must reside in one of six predefined memory locations,
  as listed in Table 10-1. STRB0 (LSTRB0) should be used because they are
  the active strobes after reset. Figure 10-2 shows the flow for the memory
  bootloader.


- The communication port bootloader waits for the first data input from one
  of the six (’C40) or four (’C44) communication port channels and uses that
  channel to perform the bootload. The format of the incoming data stream
  is similar to that for a memory data stream, except that the source memory
  width is excluded (the format is described in Table 10-2). Figure 10-3
  shows the flow of the communication port bootloader.


Table 10-1. Bootloader Mode Selection Using Pins IIOF(3-0)


External Pin                   Source Program Location
IIOF3  IIOF2  IIOF1  IIOF0     C40                   C44
  1      1      0      1       0030 0000h            0030 0000h
  1      0      1      1       4000 0000h            4000 0000h†
  1      0      0      1       6000 0000h            0080 0000h
  0      1      1      1       8000 0000h            8000 0000h†
  0      1      0      1       A000 0000h            8040 0000h†
  0      0      1      1       C000 0000h            8080 0000h†
  0      0      0      1       Reserved (the boot-   Reserved (the boot-
                               loader terminates)    loader terminates)
  1      1      1      1       Communication port    Communication port


† The ’C44 external-address buses each have only the low 24 bits of the internal
  address bus. Thus, the internal address 4000 0000h maps to 0h on the local bus.
  Any address at or above 8000 0000h is mapped to the global bus; 8080 0000h,
  for example, maps to address 0080 0000h on the global bus.
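
The selection rules in Table 10-1 can be written out as a lookup. The Python sketch below is a host-side illustration only; the names `boot_source` and `c44_external` are hypothetical. The bus split in `c44_external` follows the table footnote, with the added assumption that every external address below 8000 0000h lies on the local bus.

```python
# IIOF3..IIOF1 code -> source address, copied from Table 10-1 (IIOF0 is 1).
C40_SOURCES = {
    0b110: 0x0030_0000, 0b101: 0x4000_0000, 0b100: 0x6000_0000,
    0b011: 0x8000_0000, 0b010: 0xA000_0000, 0b001: 0xC000_0000,
}
C44_SOURCES = {
    0b110: 0x0030_0000, 0b101: 0x4000_0000, 0b100: 0x0080_0000,
    0b011: 0x8000_0000, 0b010: 0x8040_0000, 0b001: 0x8080_0000,
}


def boot_source(iiof3, iiof2, iiof1, device="C40"):
    """Return ("memory", address), ("comm_port", None), or ("reserved", None)."""
    code = (iiof3 << 2) | (iiof2 << 1) | iiof1
    if code == 0b111:
        return ("comm_port", None)
    if code == 0b000:
        return ("reserved", None)
    table = C44_SOURCES if device == "C44" else C40_SOURCES
    return ("memory", table[code])


def c44_external(address):
    """Map an internal 'C44 address to (bus, 24-bit external address).

    Assumption: addresses below 8000 0000h are treated as local-bus addresses.
    """
    bus = "global" if address >= 0x8000_0000 else "local"
    return (bus, address & 0x00FF_FFFF)
```

For instance, `boot_source(1, 0, 0, device="C44")` selects memory at 0080 0000h, and `c44_external(0x8080_0000)` gives address 0080 0000h on the global bus, as in the footnote.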
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Figure 10-1. Mode Selection Flow 
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10.3 Bootloading Sequence 


Here is the general sequence of events in bootloading a source program:


1) Select the bootloader by resetting the ’C4x while driving the
   RESETLOC(1,0) pins low, the on-chip ROM enable pin (ROMEN) high,
   and the IIOF0 pin high. The ROMEN pin must be high during bootloader
   execution, but it can be changed anytime after bootloading.


2) The status of external pins IIOF(3-1) indicates where to find the source
   program to be loaded (memory or communication port). These options are
   listed in Table 10-1. Pins IIOF(3-1) are read as the IIOF flags in the CPU
   IIF register. The bootloader takes the following steps to determine the
   source program's location, as shown in Figure 10-1.


   a) If an IIF(3-1) value from 110₂ to 001₂ (6 to 1) is found, the source
      program is loaded from the corresponding memory address shown in
      the top six lines of Table 10-1. See Figure 10-2 for details on
      bootloader memory flow.


   b) The IIF(3-1) value of 000₂ (0) is reserved. You should not use this
      mode.


   c) If none of the combinations 000₂ through 110₂ is found, the bootloader
      program assumes that loading will be via a communication port, and
      it starts checking communication port input channels (in the order port
      0 through port 5). If it finds no inputs from a communication port, the
      program returns to checking the status of the IIOF(3-1) pins again.
      See Figure 10-3 for details on bootloader communication port flow.


3) When the source program's data stream is found, the program is loaded
   at the address found in the fifth word of the data stream (the format is
   shown in Table 10-2), using the bus width specified in the first word (8, 16,
   or 32 bits wide). The bootloader cannot load the source program to any
   location below 0000 1000h, unless the address decode logic is remapped.
   The first five words of the source program specify its loading and execution
   criteria. Remaining words are the source program(s) and vector table
   pointers as shown in Table 10-2.


4) An IACK instruction is executed, indicating the completion of the bootload
   sequence. This indication can then be used to switch from microcomputer
   (ROMEN = 1) to microprocessor mode (ROMEN = 0). You do not need to
   reset the ’C4x to change the ROMEN pin. However, ensure that the ’C4x
   will not access addresses 0000 0000h to 0000 0FFFh during the change.


5) The source program is executed (the entry point is the first word of the
   first loaded program).
Figure 10-2. Memory Load Flow 


[Flowchart not reproduced. Recoverable box labels, in extraction order:
Memory load; Branch to source program address; Load block size;
Block size = 0?; Load destination address; Transfer 32-bit data from source
to destination; End of block?; Set IVTP register; Set TVTP register;
Execute IACK; Branch to destination address of first block loaded;
Begin program execution.]

Figure 10-3. Communication-Port Load Mode Flow 
The data stream with its source program(s) should be in the format shown in 
Table 10-2. The contents of words 4 through n vary for the different source 
programs loaded throughout the entire data stream. The first three words and 
the last three words are nonvariables that affect each of the source-program 
blocks. The eight least significant bits (LSBs) of the first word specify the 
memory width. If byte or half-word wide is selected, the loading sequence is 
from LSBs to MSBs. 


Table 10-2. Structure of Source Program Data Stream


Word    Contents
1       Memory width where source program resides (8, 16, or 32 bits wide)
2       Value to set in the global memory interface control register (shown in
        Figure 9-2)
3       Value to set in the local memory interface control register (shown in
        Figure 9-2)
4       Block size in 32-bit words of the first program block to be loaded (after
        the number of words is loaded, the next word should be all zeros; if not,
        another block is assumed to follow)
5       Address where the source program is to be loaded
6       First word of source program
...
n       Last word of source program (the program block is organized as words 4
        through n; these are the shaded words)
n+1     Word of all zeros. (Note that if several source-program blocks were sent,
        word n above would be the last word of the last source-program block.
        Each source-program block would have the format shown in words 4
        through n. This word of all zeros follows the last source-program block.)
n+2     IVTP value (interrupt vector table pointer, see Section 3.2)
n+3     TVTP value (trap vector table pointer, see Section 3.2)
n+4     Memory location for IACK instruction (see IACK instruction in Chapter 14)


Note: The shaded area identifies the source program block.
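
To make the layout in Table 10-2 concrete, the following Python sketch walks a list of 32-bit words in the same order the table gives. It is a host-side illustration, not the ROM bootloader; the function name `parse_boot_stream` and the dictionary keys are hypothetical.

```python
def parse_boot_stream(words):
    """Split a list of 32-bit words laid out as in Table 10-2."""
    it = iter(words)
    width_bits = next(it) & 0xFF       # only the 8 LSBs carry the width
    global_control = next(it)
    local_control = next(it)

    blocks = []
    size = next(it)
    if size == 0:
        # At least one block is expected; a zero first block size is
        # described as producing unpredictable results.
        raise ValueError("first block size must not be zero")
    while size != 0:                   # an all-zero word ends the block list
        destination = next(it)
        code = [next(it) for _ in range(size)]
        blocks.append((destination, code))
        size = next(it)

    ivtp = next(it)
    tvtp = next(it)
    iack_address = next(it)
    return {
        "width_bits": width_bits,
        "global_control": global_control,
        "local_control": local_control,
        "blocks": blocks,
        "ivtp": ivtp,
        "tvtp": tvtp,
        "iack_address": iack_address,
        "entry_point": blocks[0][0],   # execution starts at the first block
    }
```

The returned `entry_point` is the destination address of the first block, which is where execution begins after the load.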
Each source program in a multiple-block program transfer can be loaded to a
different specified destination. Each program block specifies its program's size
and destination address at the beginning of the block. End the entire block
program loader function by following the last block with an all-zero word
(0000 0000h).


The second and third last words of the source memory define the interrupt
vector table pointer (IVTP) and the trap vector table pointer (TVTP). The last word
of the source memory defines the memory location for the IACK instruction.
The IACK instruction brings the IACK signal low as data is read, if the memory
location specified in the IACK instruction is in external memory that is available
in the system. Finally, the processor begins execution of the first code block.
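
Going the other way, a host tool could assemble such a stream before it is written to ROM. The Python sketch below is an assumption-laden illustration (the function name `build_boot_stream` and its parameters are hypothetical); it only arranges words in the order described above and does not validate the control-register values.

```python
def build_boot_stream(width_bits, global_control, local_control,
                      blocks, ivtp, tvtp, iack_address):
    """Return the list of 32-bit words for a memory boot data stream.

    blocks is a list of (destination_address, [code words]) pairs.
    """
    if not blocks or any(len(code) == 0 for _, code in blocks):
        # A zero block size would be read as the end-of-blocks marker.
        raise ValueError("at least one block, and no empty blocks")
    stream = [width_bits, global_control, local_control]
    for destination, code in blocks:
        stream.append(len(code))       # block size in 32-bit words
        stream.append(destination)     # where this block is loaded
        stream.extend(code)
    stream.append(0)                   # all-zero word ends the block list
    stream.extend([ivtp, tvtp, iack_address])
    return stream
```

For a communication port boot, the same word order applies except that the leading memory-width word is not sent.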


It is assumed that at least one block of source will be loaded when
the bootloader is invoked. Initial loader invocation with a block size
of 0000 0000h produces unpredictable results.
10.4 Bootloading from External Memory (Examples) 
When the ’C4x's ROMEN input pin is high and RESETLOC(1,0) = 00₂ during
reset, the memory bootloader can load programs stored in off-chip memory
(typically 8-, 16-, or 32-bit ROMs) at an address determined by the IIOF pins
to any valid external or internal memory in the ’C4x's memory map.


Because address zero (0) is reserved for the bootloader, address
zero should not be used for the reset vector when a user-defined,
internal ROM-code mask is used.


The 8 LSBs of the first word of the data stream specify the memory width (8,
16, or 32 bits) as shown in Table 10-3, Table 10-4, and Table 10-5.


- 8-bit memories: 08h
- 16-bit memories: 0010h
- 32-bit memories: 0000 0020h


If 8- or 16-bit external memories are used, the loading sequence is from LSBs 
to MSBs. The bootloader reads the contents of 16-bit wide memories (least 
significant half word first) and packs each pair of 16-bit half words to make a 
32-bit word before loading each word to memory. Accordingly, the bootloader 
reads the contents of byte-wide memories (least significant byte first) and 
packs each group of four bytes into a 32-bit word before loading each word to 
memory. Because the bootloader packs bytes before loading, no external 
hardware is needed to pack the loaded bytes into a 32-bit word. For 32-bit wide 
external memories, no byte packing is necessary, because the memory data 
width matches that of the ’C4x. 
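
The packing order described above (least significant unit first) can be checked on a host with a few lines of Python. This is a sketch of the arithmetic only; the function name `pack_words` is hypothetical, and the real bootloader performs the packing with its own load instructions.

```python
def pack_words(units, width_bits):
    """Pack 8-, 16-, or 32-bit memory reads into 32-bit words, LSBs first."""
    if width_bits not in (8, 16, 32):
        raise ValueError("memory width must be 8, 16, or 32 bits")
    per_word = 32 // width_bits
    if len(units) % per_word:
        raise ValueError("input does not fill a whole number of 32-bit words")
    mask = (1 << width_bits) - 1
    words = []
    for start in range(0, len(units), per_word):
        word = 0
        for position, unit in enumerate(units[start:start + per_word]):
            # The first unit read lands in the least significant bits.
            word |= (unit & mask) << (position * width_bits)
        words.append(word)
    return words
```

Reading the bytes 50h, 92h, 73h, 1Dh in that order yields the word 1D73 9250h, and the half words C9F0h, 1D7Bh yield 1D7B C9F0h.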


For 16-bit memories, the data read is expected to be in bit positions 0-15.
Thus, the half-word memory's data lines should be interfaced to ’C4x data lines
(L)D15-0. For byte-wide memories, the data read is expected to be in bit
positions 0-7. Hence, the byte-wide memory's data lines should be interfaced to
’C4x data lines (L)D7-0. Even though the ’C4x does not require that unused
data lines be pulled up to Vcc, it is recommended that each unused data line
be pulled up through a separate 22-kΩ resistor to 5 volts for minimum power
dissipation.


Table 10-3, Table 10-4, and Table 10-5 show example data streams for 8-bit,
16-bit, and 32-bit wide configured memories, respectively.


These examples assume that: 


- The status of the IIOF(3-1) pins is 110₂ after reset is deasserted (memory
  load from 0030 0000h; see Table 10-1).


- The source program resides at memory location 0030 0000h and defines
  the following:

  - Memory width for bootloader: 8, 16, or 32 bits

  - Global bus memory with one software wait state, external RDY (SWW
    = 11), page size = 64K words for both STRB0 and STRB1, and an
    active address range = 1G words for both STRB0 and STRB1.

  - Local memory bus that requires two software wait states (SWW = 01),
    page size = 32K words, and active address range = 1G words for both
    STRB0 and STRB1.

  - First block program of 294 words in length and whose destination
    address is at 002F F840h.

  - Second block program of 64 words in length and whose destination
    address is at 002F F800h.

  - IVTP and TVTP, which are overlapped and point to the beginning of
    the on-chip RAM.

  - Memory location of 0030 0000h for IACK instruction.
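
The word numbers and addresses in the example tables follow directly from these assumptions. The Python sketch below works through the arithmetic; the helper names `total_stream_words` and `word_address` are hypothetical, and the address rule simply assumes that each 32-bit stream word occupies 4, 2, or 1 consecutive memory locations for 8-, 16-, and 32-bit memories.

```python
def total_stream_words(block_sizes):
    """Count 32-bit words in a memory boot stream with the given block sizes."""
    header = 3                                     # width + two control words
    blocks = sum(2 + size for size in block_sizes) # size word + address + code
    trailer = 1 + 3                                # zero word, IVTP, TVTP, IACK
    return header + blocks + trailer


def word_address(word_number, width_bits, base=0x0030_0000):
    """First memory address holding stream word `word_number` (1-based)."""
    locations_per_word = 32 // width_bits
    return base + (word_number - 1) * locations_per_word
```

With blocks of 294 and 64 words, the stream is 369 words long; word 366 (the terminating zero word) starts at 0030 05B4h in byte-wide memory, 0030 02DAh in 16-bit memory, and 0030 016Dh in 32-bit memory.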


Table 10-3. Byte-Wide Configured Memory


Word    Address        Value   Comments
1       0030 0000h     08h     Memory width = 8 bits
        0030 0001h     00h
        0030 0002h     00h
        0030 0003h     00h
2       0030 0004h     F0h     Global memory bus control word = 1D7B C9F0h
        0030 0005h     C9h     (Described in Figure 9-2)
        0030 0006h     7Bh
        0030 0007h     1Dh
3       0030 0008h     50h     Local memory bus control word = 1D73 9250h
        0030 0009h     92h     (Described in Figure 9-2)
        0030 000Ah     73h
        0030 000Bh     1Dh
4       0030 000Ch     26h     1st source program block size = 126h
        0030 000Dh     01h
        0030 000Eh     00h
        0030 000Fh     00h
5       0030 0010h     40h     1st source program block starting addr = 002F F840h
        0030 0011h     F8h
        0030 0012h     2Fh
        0030 0013h     00h
6       0030 0014h             1st source program block starts here (first word)
to      ...
299     0030 04ABh             1st source program block ends here (last word)
300     0030 04ACh     40h     2nd source program block size = 40h
        0030 04ADh     00h
        0030 04AEh     00h
        0030 04AFh     00h
301     0030 04B0h     00h     2nd source program block starting addr = 002F F800h
        0030 04B1h     F8h
        0030 04B2h     2Fh
        0030 04B3h     00h
302     0030 04B4h             2nd source program block starts here (first word)
to      ...
365     0030 05B3h             2nd source program block ends here (last word)
366     0030 05B4h     00h     Value 0 to terminate the program block load
        0030 05B5h     00h
        0030 05B6h     00h
        0030 05B7h     00h
367     0030 05B8h     00h     IVTP = 002F F800h
        0030 05B9h     F8h
        0030 05BAh     2Fh
        0030 05BBh     00h
368     0030 05BCh     00h     TVTP = 002F F800h
        0030 05BDh     F8h
        0030 05BEh     2Fh
        0030 05BFh     00h
369     0030 05C0h     00h     Memory location for IACK instruction = 0030 0000h
        0030 05C1h     00h
        0030 05C2h     30h
        0030 05C3h     00h     (This is the final word in the data stream.)


Note: Words 6 to 299 and 302 to 365 (shaded in the original table) are the
      source program blocks.
Table 10-4. 16-Bit Wide Configured Memory


Word    Address        Value    Comments
1       0030 0000h     0010h    Memory width = 16 bits
        0030 0001h     0000h
2       0030 0002h     C9F0h    Global memory bus control word = 1D7B C9F0h
        0030 0003h     1D7Bh
3       0030 0004h     9250h    Local memory bus control word = 1D73 9250h
        0030 0005h     1D73h
4       0030 0006h     0126h    1st program block size = 126h
        0030 0007h     0000h
5       0030 0008h     F840h    1st program block starting addr = 002F F840h
        0030 0009h     002Fh
6       0030 000Ah              1st program block starts here (first word)
to      ...
299     0030 0255h              1st program block ends here (last word)
300     0030 0256h     0040h    2nd program block size = 40h
        0030 0257h     0000h
301     0030 0258h     F800h    2nd program block starting addr = 002F F800h
        0030 0259h     002Fh
302     0030 025Ah              2nd program block starts here (first word)
to      ...
365     0030 02D9h              2nd program block ends here (last word)
366     0030 02DAh     0000h    Value 0 to terminate the program block load
        0030 02DBh     0000h
367     0030 02DCh     F800h    IVTP = 002F F800h
        0030 02DDh     002Fh
368     0030 02DEh     F800h    TVTP = 002F F800h
        0030 02DFh     002Fh
369     0030 02E0h     0000h    Memory location for IACK instruction = 0030 0000h
        0030 02E1h     0030h    (This is the final word in the data stream.)


Note: Words 6 to 299 and 302 to 365 (shaded in the original table) are the
      source program blocks.


Table 10-5. 32-Bit Wide Configured Memory


Word    Address        Value          Comments
1       0030 0000h     0000 0020h     Memory width = 32 bits
2       0030 0001h     1D7B C9F0h     Global memory bus control word = 1D7B C9F0h
3       0030 0002h     1D73 9250h     Local memory bus control word = 1D73 9250h
4       0030 0003h     0000 0126h     1st program block size = 126h
5       0030 0004h     002F F840h     1st program block starting addr = 002F F840h
6       0030 0005h                    1st program block starts here (first word)
to      ...
299     0030 012Ah                    1st program block ends here (last word)
300     0030 012Bh     0000 0040h     2nd program block size = 40h
301     0030 012Ch     002F F800h     2nd program block starting addr = 002F F800h
302     0030 012Dh                    2nd program block starts here (first word)
to      ...
365     0030 016Ch                    2nd program block ends here (last word)
366     0030 016Dh     0000 0000h     Value 0 to terminate the program block load
367     0030 016Eh     002F F800h     IVTP = 002F F800h
368     0030 016Fh     002F F800h     TVTP = 002F F800h
369     0030 0170h     0030 0000h     Address location for IACK instruction = 0030 0000h


Note: Words 6 to 299 and 302 to 365 (shaded in the original table) are the
      source program blocks.
10.5 Bootloading from a Communication Port (Examples) 
A value of all 1s on IIOF(3-0) signals that the source program is being
transmitted via a communication port. Bringing all four of the IIOF(3-0) pins high
also allows the pins to be used as interrupt lines without any external decode
logic. With pins IIOF(3-0) all high at reset, the ’C4x polls the input level of each
port to determine which channel contains the program. The input data
sequence of the communication bootloader is the same as that of the memory
bootloader except that it lacks the source memory width definition (because
the memory width of the communication port bootloader is fixed).
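
Because the two formats differ only in the leading width word, a memory-format stream can be turned into a communication-port stream by dropping that word. The Python sketch below illustrates this; `memory_to_comm_port_stream` is a hypothetical helper name, and `CODE_WORD` is a placeholder value, not a real ’C4x opcode.

```python
def memory_to_comm_port_stream(memory_stream):
    """Drop word 1 (memory width); all other words are sent unchanged."""
    if len(memory_stream) < 2:
        raise ValueError("stream too short to contain a width word and data")
    return list(memory_stream[1:])


# Placeholder for a single instruction word; the value is an assumption
# made for illustration and carries no meaning on the device.
CODE_WORD = 0x12345678

memory_format = [
    32,                      # memory width (memory boot only)
    0x3003C000, 0x3D79C210,  # global and local control register values
    1, 0x002FF800,           # block size, destination address
    CODE_WORD,               # the one-word program block
    0,                       # end of blocks
    0x002FFD00, 0x002FFD00,  # IVTP, TVTP
    0x40000000,              # address for IACK
]
comm_port_format = memory_to_comm_port_stream(memory_format)
```

Here `comm_port_format` holds nine words, beginning with the global control register value.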


Example 10-1 is a program listing for booting a multiprocessor system. 


After a 32-bit boot from external memory, the master ’C4x boots, via a
communication port, another ’C4x (slave processor) connected to communication
port 0 of the master processor. Both processors stay in an infinite loop after
booting. The code should be loaded in the master ’C4x EPROM in the correct
memory location according to the IIOF settings of the master ’C4x. All IIOF pins
of the slave processor should be set to 1. The ROMEN pin is enabled
(ROMEN = 1) and the RESETLOC(1,0) pins are low in both processors. For a
description of how to convert an executable COFF file into an EPROM
programmer format, see the hex conversion utility in the TMS320 Floating-Point
Assembly Language Tools User's Guide (literature number SPRU035).


Example 10-1. Booting a ’C4x Multiprocessor System


*
*       MASTER PROCESSOR BOOT TABLE
*
        .word   32              ; memory width
        .word   3003c000h       ; master global control register
                                ; (system specific !!)
        .word   3d79c210h       ; master local control register
                                ; (system specific !!)
*
*       MASTER PROCESSOR PROGRAM BLOCK
*
        .word   10              ; block size
        .word   2ff800h         ; block dest addr
* Code for master processor: this code sends boot table to slave processor
        ldi     8,rc            ; loop 9 times: size of slave processor
                                ; boot table
        rptbd   endbl
        ldp     src             ; src in external memory
        ldi     @src,ar0
        ldi     @dst,ar1
        ldi     *ar0++(1),r0    ; block start
endbl:  sti     r0,*ar1
        bu      $               ; master processor loops forever
src     .word   BOOT_TABLE2     ; address of boot table of slave
                                ; processor
dst     .word   100042h         ; address of OFIFO connected to slave
                                ; processor
*
*       END OF ALL BLOCKS
*
        .word   0               ; master end of bootload sequence
        .word   2ffd00h         ; master IVTP value
        .word   2ffd00h         ; master TVTP value
        .word   40000000h       ; master address for iack
*
*       END OF MASTER PROCESSOR BOOT TABLE : size = 9 words
Example 10-1. Booting a ’C4x Multiprocessor System (Continued)


*
*       SLAVE PROCESSOR BOOT TABLE
*
BOOT_TABLE2:                    ; slave BOOT TABLE
        .word   3003c000h       ; slave global control register
                                ; (system specific !!!)
        .word   3d79c210h       ; slave local control register
                                ; (system specific !!!)
        .word   1               ; block size
        .word   2ff800h         ; dst load address
        bu      $               ; slave processor loops forever
        .word   0               ; slave end of bootload sequence
        .word   2ffd00h         ; slave IVTP value
        .word   2ffd00h         ; slave TVTP value
        .word   40000000h       ; slave address for iack
*
*       END OF EPROM CODE


10.6 Modifying the IIOFx Pins After Bootloading 


The load options are based upon the status of IIOF(3-0) as general-purpose
input pins. Therefore, to select the correct bootloader mode, pins IIOF(3-0)
must be kept at a constant valid status value (see Table 10-1 for a list of
values).


After the bootload is complete, the IACK signal is brought low until the read
phase in the pipeline finishes. Figure 10-4 shows an example circuit that
generates the IIOF(3-0) signals for bootload selection and, after bootload
operation, allows incoming external interrupts. In this example, after reset, the
IIOF pins stay low until the IACK signal is received.


Figure 10-4. Circuit for Generation of a Low IIOF Signal for Bootloader Selection 
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10.7 The Bootloader Program 


*             TMS320C4x PROCESSOR BOOTLOADER                         *
**********************************************************************
BOOT:     LDI    COM_LOAD,R10     ; Comm. port load subroutine address -> R10
          LDHI   0010H,AR0        ; Load peripheral mem. map start addr 100000H
* CHECK THE IIOF1-3 FOR THE BOOTLOADER
CHECK:    LDHI   0030H,AR1        ; Load memory address = 00300000H
          CMPI   04404H,IIF       ; Test function 110 condition
          BEQ    MEMORY           ; If true, execute memory bootloader
          LDHI   04000H,AR1       ; Load memory address = 40000000H
          CMPI   04044H,IIF       ; Test function 101 condition
          BEQ    MEMORY           ; If true, execute memory bootloader
;
          LDHI   06000H,AR1       ; Load memory address = 60000000H
; 'C44:   LDHI   00080H,AR1       ; replace previous line with this line ('C44)
;
          CMPI   04004H,IIF       ; Test function 100 condition
          BEQ    MEMORY           ; If true, execute memory bootloader
          LDHI   08000H,AR1       ; Load memory address = 80000000H
          CMPI   00444H,IIF       ; Test function 011 condition
          BEQ    MEMORY           ; If true, execute memory bootloader
;
          LDHI   0A000H,AR1       ; Load memory address = A0000000H
; 'C44:   LDHI   08040H,AR1       ; replace previous line with this line ('C44)
;
          CMPI   00404H,IIF       ; Test function 010 condition
          BEQ    MEMORY           ; If true, execute memory bootloader
;
          LDHI   0C000H,AR1       ; Load memory address = C0000000H
; 'C44:   LDHI   08080H,AR1       ; replace previous line with this line ('C44)
;
          CMPI   00044H,IIF       ; Test function 001 condition
          BEQ    MEMORY           ; If true, execute memory bootloader
          CMPI   00004H,IIF       ; Test function 000 condition
          BEQ    STATRAM          ; If true, branch to STATIC RAM TEST
*
*         COMMUNICATION PORT BOOTLOADER
*
*
*         CHECK COMMUNICATION PORT INPUT CHANNEL
*
          ADDI   040H,AR0,AR3     ; Point to comm. port 0 control register addr
          LDI    5,AR1            ; Set loop counter for CHECK_CH loop
CHECK_CH: LSH3   -9,*AR3,R1       ; Check comm port input
          BNZ    LOAD0            ; If input exist, start comm port loader
          ADDI   010H,AR3         ; Point to next comm. port channel addr
          DBU    AR1,CHECK_CH     ; Check next comm. port channel input
          B      CHECK            ; Recheck the input flags


*
*         MEMORY BOOTLOADER
*
*
*         TEST MEMORY WORD WIDTH
*
MEMORY:   LDI    *AR1++(1),R1     ; Load the memory word width
          LDI    W_WIDE,R10       ; Full-word size subroutine address -> R10
          LSH    26,R1            ; Test bit5 of mem. width word
          BN     LOAD0            ; If '1' start PGM loading (32 bits width)
          NOP    *AR1++(1)        ; Jump last half word from mem. word
          LDI    H_WIDE,R10       ; Half-word size subroutine address -> R10
          LSH    1,R1             ; Test bit4 of mem. width word
          BN     LOAD0            ; If '1' start PGM loading (16 bits width)
          NOP    *AR1++(1)        ; Jump last 1 bytes from mem. word
          LDI    B_WIDE,R10       ; Byte size subroutine address -> R10
          NOP    *AR1++(1)        ; Jump last 1 bytes from mem. word
*
*         START PROGRAM LOADING
*
LOAD0:    LAJU   R10              ; Load new word according to mem. width
          LDHI   0010H,AR0        ; Load peripheral mem. map start addr 100000H
          LDI    0,R0             ; Set start address flag off
          NOP
          LAJU   R10              ; Load new word according to mem. width
          STI    AR2,*AR0         ; Set global bus control register
          NOP
          NOP
          STI    AR2,*+AR0(4)     ; Set local bus control register
LOAD2:    LAJU   R10              ; Load new word according to mem. width
          ADDI   1,R0             ; Set start address flag off
          NOP
          NOP
          CMPI   0,AR2            ; If 0 block size start PGM
          BEQ    IVTP_LOAD
          LAJU   R10              ; Load new word according to mem. width
          SUBI3  1,AR2,RC         ; Set block size for repeat loop
          NOP
          SUBI   1,R10            ; Sub address with loop
          LDI    R0,R0            ; Test start address loaded flag
          LDIP   AR2,R9           ; Load start address if flag off
          LAJU   R10              ; Load block words according to mem. width
          LDI    AR2,AR0          ; Set destination address
          LDI    -1,R0            ; Set start & dest. address flag on
          ADDI   1,R10            ; Sub address without loop
          B      LOAD2            ; Jump to load a new block when loop completed
*
* INITIALIZE IVTP AND TVTP REGISTERS
*
IVTP_LOAD: LAJU  R10               ; Load new word according to mem. width
          NOP
          NOP
          NOP
TVTP_LOAD: LAJU  R10               ; Load new word according to mem. width
          LDPE   AR2,IVTP          ; Load the IVTP pointer
          NOP
          NOP
          LAJU   R10               ; Load new word according to mem. width
          LDPE   AR2,TVTP          ; Load the TVTP pointer
          NOP
          NOP
          IACK   *AR2              ; Send out IACK signal
          BU     R9                ; Branch to the start of the program
;
; BYTE-WIDE MEMORY BOOTLOADER SUBROUTINE
;
LOOP_B    RPTB   LOAD_B            ; PGM load loop
B_WIDE    LWL0   *AR1++(1),AR2     ; Load byte 0 (LSB)
          NOP                      ; Nop for STRB to go high
          LWL1   *AR1++(1),AR2     ; Join byte 1 with byte 0
          NOP                      ; Nop for STRB to go high
          LWL2   *AR1++(1),AR2     ; Join byte 2 with byte 0 & 1
          NOP                      ; Nop for STRB to go high
          LWL3   *AR1++(1),AR2     ; Join byte 3 with byte 0, 1, & 2
          LDI    R0,R0             ; Test load address flag
          BNN    B_END
LOAD_B    STI    AR2,*AR0++(1)     ; Store new word to dest. address
B_END     BU     R11               ; Return from subroutine
;
; HALF-WORD WIDE MEMORY BOOTLOADER SUBROUTINE
;
LOOP_H    RPTB   LOAD_H            ; PGM load loop
H_WIDE    LWL0   *AR1++(1),AR2     ; Load LSB half-word
          NOP                      ; Nop for STRB to go high
          LWL2   *AR1++(1),AR2     ; Join MSB half-word with LSB half-word
          LDI    R0,R0             ; Test load address flag
          BNN    H_END
LOAD_H    STI    AR2,*AR0++(1)     ; Store new word to dest. address
H_END     BU     R11               ; Return from subroutine
;
; FULL-WORD WIDE MEMORY BOOTLOADER SUBROUTINE
;
LOOP_W    RPTB   LOAD_W            ; PGM load loop
W_WIDE    LDI    *AR1++(1),AR2     ; Read a new 32 bits word
          LDI    R0,R0             ; Test load address flag
          BNN    W_END
LOAD_W    STI    AR2,*AR0++(1)     ; Store new word to dest. address
W_END     BU     R11               ; Return from subroutine
;
; COMMUNICATION PORT BOOTLOADER SUBROUTINE
;
LOOP_C    RPTB   LOAD_C            ; PGM load loop
COM_LOAD  LSH3   -9,*AR3,R1        ; Check comm port input
          BZ     COM_LOAD          ; Wait for comm port input
          LDI    *+AR3(1),AR2      ; Read a new 32 bits word
          LDI    R0,R0             ; Test load address flag
          BNN    C_END
LOAD_C    STI    AR2,*AR0++(1)     ; Store new word to dest. address
C_END     BU     R11               ; Return from subroutine
          .end
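
The memory bootloader listing implies a layout for the boot stream: a width word, two bus control words, a series of blocks (size, destination address, data words) ended by a zero size, and finally the IVTP, TVTP, and IACK address words. The Python sketch below models that layout for illustration only. It is inferred from the listing, not taken from any TI tool, and the names `detect_width`, `read_word`, and `parse_boot_stream` are hypothetical.

```python
def detect_width(first_location):
    """Mirror the MEMORY routine: bit 5 selects 32-bit, bit 4 selects 16-bit."""
    if first_location & 0x20:
        return 32
    if first_location & 0x10:
        return 16
    return 8


def read_word(mem, pos, width):
    """Assemble one 32-bit word from 32/width locations, least significant part first."""
    parts = 32 // width
    mask = (1 << width) - 1
    word = 0
    for i in range(parts):
        word |= (mem[pos + i] & mask) << (i * width)
    return word, pos + parts


def parse_boot_stream(mem):
    """Walk a boot stream (a list of memory locations) the way the listing does."""
    width = detect_width(mem[0])
    pos = 32 // width                      # skip the remainder of the width word
    global_ctl, pos = read_word(mem, pos, width)
    local_ctl, pos = read_word(mem, pos, width)
    blocks, entry = [], None
    while True:
        size, pos = read_word(mem, pos, width)
        if size == 0:                      # a zero block size ends the program blocks
            break
        dest, pos = read_word(mem, pos, width)
        if entry is None:
            entry = dest                   # first block address becomes the start address
        data = []
        for _ in range(size):
            word, pos = read_word(mem, pos, width)
            data.append(word)
        blocks.append((dest, data))
    ivtp, pos = read_word(mem, pos, width)
    tvtp, pos = read_word(mem, pos, width)
    iack, pos = read_word(mem, pos, width)
    return {"width": width, "global_ctl": global_ctl, "local_ctl": local_ctl,
            "blocks": blocks, "entry": entry,
            "ivtp": ivtp, "tvtp": tvtp, "iack": iack}
```

The same word sequence parses identically at any width; only the number of locations consumed per word changes.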


Chapter 11 


The DMA Coprocessor 


The direct memory access (DMA) coprocessor is a programmable on-chip device
that allows simultaneous memory transfer and CPU operation with minimum CPU
overhead. This chapter describes the DMA coprocessor and also offers
suggestions for programming the device.


11.1 Introduction 


The DMA coprocessor is a self-programmable peripheral that transfers blocks
of data, maximizing sustained CPU performance by relieving the CPU of
burdensome I/O duties. Its key features are:

- Transfers to and from anywhere in the processor’s memory map. For example,
  transfers can be made to and from on-chip memory, off-chip memory, and any
  of the six on-chip communication ports.

- Six DMA channels for memory-to-memory transfers in unified mode; a special
  split mode supports 12 DMA channels for communication port to/from memory
  transfers.

- Automatic initialization of registers via linked lists stored in memory,
  allowing the DMA to run continuously without intervention by the CPU.

- Concurrent CPU and DMA coprocessor operation with DMA transfers at the same
  rate as the CPU (supported by separate internal DMA address and data buses).

- Source and destination address registers with variable indices, making it
  possible to step through matrices by row or column.

- Bit-reversed addressing for FFTs.

- Synchronization of data transfers via external and internal interrupts.



11.2 DMA Functional Description 


The DMA coprocessor supports six DMA channels that perform transfers to 
and from anywhere in the ’C4x memory map. 


Each DMA channel is controlled by nine registers that are mapped in the ’C4x 
peripheral address space, as shown in Figure 11-1. The major DMA registers 
are described in Section 11.3. 


The DMA coprocessor has dedicated on-chip address and data buses (see 
Figure 2-8 for a block diagram of the peripherals of the ’C4x). All accesses 
made by the six DMA channels are arbitrated in the DMA coprocessor and take 
place over these dedicated buses. The six DMA channels transfer data in a 
sequential time-slice fashion, rather than simultaneously, because they share 
common buses. 


The DMA channels can run constantly or can be triggered by external 
(IIOF3-0) or internal (on-chip timers and communication ports) interrupts. 


The DMA coprocessor can transfer data in a bit-reversed fashion (for FFT
applications) or in a linear fashion; it can also transfer matrix data in a
row or column fashion.


The DMA coprocessor has two basic operational modes:


- Unified Mode: Used for memory-to-memory transfers. The unified mode is
  described in Section 11.4, DMA Unified Mode. The unified block transfer
  sequence is presented in subsection 11.2.1, DMA Basic Operation.


- Split Mode: Used for two-way, memory-to-communication port transfers.
  The split mode is described in Section 11.5, DMA Split Mode.
Figure 11-1.


11.2.1 DMA Basic Operation 


If a block of data is to be transferred from one region in memory to another
region in memory (unified mode), the following sequence is performed:


DMA Registers Initialization


1) The source address register of a DMA channel is loaded with the address
   of the memory location to read from.


2) The destination address register of the same DMA channel is loaded with
   the address of the memory location to write to.


3) The transfer counter is loaded with the number of words to be transferred.


4) The source/destination index register is loaded with the step size of the
   source/destination register update. If sequential memory accesses are
   required, the source address index register and the destination address
   index register must be set to 1.


5) The DMA channel control register is loaded with the appropriate modes
   to synchronize the DMA coprocessor reads and writes with interrupts. The
   DIE register determines which interrupt to use for synchronous transfer.


DMA Start


6) The DMA coprocessor is started via the DMA START field in the DMA
   channel control register.


Word Transfers


7) The DMA channel reads a word from the address held in the source address
   register and writes it to a temporary register within the DMA channel.


8) After a read by the DMA channel, the source-index register is added to the
   source address register.


9) After the read operation completes, the DMA channel writes the temporary
   register value to the destination address pointed to by the destination
   address register.


10) After the destination address has been fetched, the transfer counter
    register is decremented and the destination-index register is added to
    the destination-address register.


Note:

Both of the index registers (source and destination) contain signed values.
This allows for variable step sizes or continuous reads from and/or writes to
memory. When an index register equals zero, the DMA coprocessor transfers
data to or from a fixed location.


11) During every data write, the transfer counter is decremented. The block
    transfer terminates when the transfer counter reaches zero and the write
    of the last transfer is completed. The DMA channel sets the transfer
    counter interrupt (TCINT) flag in the DMA channel control register.
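
Steps 7 through 11 can be pictured with a short Python model. This is only a sketch of the sequence described above, not the hardware's timing: `memory` is an ordinary list standing in for the memory map, the function name `block_transfer` is hypothetical, and the model accepts counts of 1 or more only (a count of zero behaves differently, as described in subsection 11.3.3).

```python
def block_transfer(memory, src, dst, count, src_index=1, dst_index=1):
    """Copy `count` words, stepping each address by its signed index."""
    if count < 1:
        raise ValueError("this sketch models counts of 1 or more only")
    while True:
        temp = memory[src]        # step 7: read into the temporary register
        src += src_index          # step 8: add the source index
        memory[dst] = temp        # step 9: write the temporary register
        dst += dst_index          # step 10: add the destination index
        count -= 1                # steps 10-11: decrement the transfer counter
        if count == 0:
            return src, dst       # the TCINT flag would be set at this point


# Gather every second word of a buffer into consecutive locations.
memory = list(range(100, 110)) + [0] * 10
block_transfer(memory, src=0, dst=10, count=4, src_index=2, dst_index=1)
```

Setting an index to zero keeps that address fixed, which is how a transfer to or from a single location (such as a FIFO) is expressed.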


After the completion of a block transfer, the DMA coprocessor can be
programmed to do several things:


- Stop until reprogrammed (TRANSFER MODE bits = 01₂)

- Continue transferring data (TRANSFER MODE bits = 00₂)

- Generate an interrupt to signal the CPU that the block transfer is complete
  (TCC bit = 1)

- Autoinitialize itself to start the next block transfer (TRANSFER MODE
  bits = 10₂ or 11₂)


During autoinitialization, each DMA channel reads new DMA register values
from memory, loads these values into its register file, and, according to the
values loaded, begins another block transfer. Whether or not the CPU must
initialize transfers is determined by the value of the TRANSFER MODE bits:


- Autoinitialization under TRANSFER MODE bits = 10₂ is done without any
  intervention by the CPU.

- Autoinitialization under TRANSFER MODE bits = 11₂ requires the CPU to start
  the DMA.


11.3 DMA Registers 




Each DMA channel has nine registers designated as follows:


- Control register: contains the status and mode information about the
  associated DMA channel.

- Source address register: contains the memory address of data to be read.

- Source address-index register: contains the step size (a signed 32-bit
  number) used to increment or decrement the source address register.

- Destination address register: contains the memory address where data is
  written.

- Destination address-index register: contains the step size (a signed
  32-bit number) used to increment or decrement the destination address
  register.

- Transfer counter register: contains the block size to move in unified
  mode or in split mode (primary channel).

- Auxiliary transfer-counter register: contains the block size to move in
  split mode (auxiliary channel).

- Link pointer register: contains the memory address of data to
  autoinitialize the DMA channel registers. Used for unified mode or the
  primary channel in split mode.

- Auxiliary link-pointer register: contains the memory address of data to
  autoinitialize the DMA channel registers. Used for the auxiliary channel in
  split mode.


After reset, the control register, the transfer counter, and the auxiliary transfer 
counter registers are set to zeros and the other registers are undefined. 


11.3.1 Control Register 


The format of the DMA-channel control register is shown in Figure 11-2. The 
text following the figure describes the functions of each field in the register. 


At reset, each DMA-channel control register is set to zero. This makes the 
DMA channels lower-priority than the CPU, sets up the source address and 
destination address to be calculated via linear addressing, and configures the 
DMA channel in the unified mode. 
Figure 11-2. DMA Channel Control Register

Legend: R = bit may be read; W = bit may be written; S = bit is shadowed
during autoinitialization (no changes take place until autoinitialization is
complete); A = bit is auxiliary for autoinitialization. Some bits are
reserved, and the PRIORITY MODE bit applies to DMA channel 0 only.
DMA PRI
    Sets DMA coprocessor priority. Defines the arbitration rules to be used
    when a DMA channel and the CPU are requesting the same resource. Affects
    all DMA coprocessor modes. The rules are listed in Table 11-1.

TRANSFER MODE
    Defines the transfer mode used by the DMA channel. Affects unified mode
    and the primary channel in split mode. The bits are defined in
    Table 11-2.

AUX TRANSFER MODE
    Defines the transfer mode used by the DMA channel. Affects the auxiliary
    channel in split mode only. The bits are defined in Table 11-2.

SYNC MODE
    Determines the mode of synchronization for performing data transfers.
    These bits work differently in unified and split modes. See Table 11-3
    and Table 11-4 for bit descriptions for unified and split modes.

    Note: If a DMA channel is interrupt driven for both reads and writes, and
    the interrupt for the write comes before the interrupt for the read, the
    interrupt for the write is latched by the DMA channel. After the read is
    complete, the write can be executed.

AUTOINIT STATIC
    This bit affects unified mode and the primary channel in split mode. It
    keeps the link pointer constant during autoinitialization from the
    on-chip communication ports or other stream-oriented devices (such as
    first-in first-out (FIFO) memory buffers). If bit = 0, the link pointer
    is incremented during autoinitialization. If bit = 1, the link pointer is
    not incremented (it is static) during autoinitialization.


AUX AUTOINIT STATIC
    Acts like the AUTOINIT STATIC bit above, except that it affects the
    auxiliary channel in split mode only.

AUTOINIT SYNC
    This bit has an effect only in the DMA coprocessor sync mode (bits 7-6
    above). It affects the interrupt that is enabled by the DMA interrupt
    enable register (shown in Figure 11-25) used for DMA reads. If bit = 0,
    the interrupt is ignored, and the autoinitialization reads are not
    synchronized with any interrupt signals. If bit = 1, the interrupt is
    recognized and is also used to synchronize the autoinitialization reads.
    This affects the unified mode and the primary channel in split mode (see
    the SPLIT MODE bit). The effect of this bit and the SYNC MODE bits in
    autoinitialization is summarized in Table 11-9.

AUX AUTOINIT SYNC
    Acts the same as the AUTOINIT SYNC bit above, except that it affects the
    auxiliary channel in split mode. The effect of this bit and the SYNC MODE
    bits in autoinitialization is summarized in Table 11-9.

READ BIT REV
    Selects the type of addressing for modifying the source address. If
    bit = 0, the source address is modified using 32-bit linear addressing.
    If bit = 1, the source address is modified using 24-bit bit-reversed
    addressing. The bit affects unified mode and primary channel reads
    (source) in split mode.

WRITE BIT REV
    Selects the type of addressing for modifying the destination address. If
    bit = 0, the destination address is modified using 32-bit linear
    addressing. If bit = 1, the destination address is modified using 24-bit
    bit-reversed addressing. The bit affects unified mode and auxiliary
    channel writes (destination) in split mode.

SPLIT MODE
    This bit controls the DMA coprocessor mode of operation. If bit = 0, DMA
    transfers are from memory to memory. This is referred to as unified mode.
    If bit = 1, split mode is entered with each DMA channel split into two
    channels, allowing a single DMA channel to perform
    memory-to-communication-port and communication-port-to-memory transfers.
    The split mode can be modified by autoinitialization in unified mode or
    by autoinitialization by the auxiliary channel in split mode. Split mode
    is further described in Section 11.5, DMA Split Mode.

COM PORT
    These bits define a communication port (000₂ to 101₂) to be used for DMA
    transfers. If SPLIT MODE = 0, COM PORT has no effect on the operation of
    the DMA channel. If SPLIT MODE = 1, COM PORT defines which of the six
    communication ports to use with the DMA channel. The COM PORT field may
    be modified by autoinitialization in unified mode or by autoinitialization
    by the auxiliary channel in split mode.
TCC
    Transfer counter interrupt control. If TCC = 1, a DMA channel interrupt
    pulse is sent to the CPU after the transfer counter makes a transition to
    zero and the write of the last transfer is complete. If enabled, the
    corresponding DMA interrupt (DMA INT0-INT5) occurs at the vector shown in
    Figure 7-2. If TCC = 0, a DMA channel interrupt pulse is not sent to the
    CPU when the transfer counter transitions to zero. This bit affects
    unified mode and the primary channel in split mode.

AUX TCC
    Auxiliary transfer counter interrupt control. If bit = 1, a DMA channel
    interrupt pulse is sent to the CPU after the auxiliary transfer counter
    makes a transition to zero and the write of the last transfer is
    complete. If enabled, the corresponding DMA interrupt (DMA INT0-INT5)
    occurs as shown in Figure 7-2. If bit = 0, a DMA channel interrupt pulse
    is not sent to the CPU when the auxiliary transfer counter transitions to
    zero. This bit affects the auxiliary channel in split mode only.

TCINT FLAG
    Transfer counter interrupt flag. This flag is set to 1 whenever the
    transfer counter makes a transition to zero and the write of the last
    transfer is completed. Whenever the DMA channel control register is read,
    this flag is cleared, unless the flag is being set by the DMA in the same
    cycle as the read. The TCINT FLAG is affected by the unified mode and the
    primary channel in split mode.

AUX TCINT FLAG
    Auxiliary transfer counter interrupt flag. This flag is set to 1 whenever
    the auxiliary transfer counter makes a transition to zero and the write
    of the last transfer is completed. Whenever the DMA control register is
    read, this flag is cleared, unless the flag is being set by the DMA
    coprocessor in the same cycle as the read. The AUX TCINT FLAG is affected
    by the auxiliary channel in split mode. Since only one interrupt is
    available for a DMA channel, you can determine which event set the
    interrupt by examining the TCINT FLAG and the AUX TCINT FLAG.

START
    Starts and stops the DMA channel in several different ways (as listed in
    Table 11-5). START affects the unified mode and the primary channel in
    split mode. If the START or AUX START bits are used to hold a channel in
    the middle of an autoinitialization sequence, they hold the
    autoinitialization sequence. If the START or AUX START bits are being
    modified by the DMA channel (for example, to force a halt code of 10₂ on
    a transfer-counter terminated block transfer) and a write is being
    performed by an external source to the DMA channel control register,
    internal modification of the START or AUX START bits by the DMA channel
    has priority. See the TRANSFER MODE bits value of 01₂ in Table 11-2 for
    more information.

AUX START
    Starts and stops the DMA channel in several different ways (as listed in
    Table 11-5). AUX START affects the auxiliary channel in split mode only.

STATUS
    Indicates the status of the DMA channel as listed in Table 11-6. STATUS
    is updated in the unified mode and by the primary channel in the split
    mode. Updates are performed every cycle. The STATUS and AUX STATUS bits
    also determine if the DMA channel has halted or has been reset after
    writing to the START or AUX START bits.

AUX STATUS
    Indicates the status of the DMA channel as listed in Table 11-6. AUX
    STATUS is updated by the auxiliary channel in split mode only. Updates
    are performed every cycle.

PRIORITY MODE
    Priority mode of DMA channel access. If bit = 0, priority rotates as
    shown in Section 11.6. If bit = 1, priority is fixed as shown in
    Section 11.6. This bit is available only at DMA channel zero.
Table 11-1. DMA PRI Bits and CPU/DMA Arbitration Rules

DMA PRI
(bits 1-0)   Description
----------   -----------
00           DMA coprocessor access is lower priority than CPU access. If
             the DMA channel and the CPU are requesting the same resource,
             then the CPU will proceed. These bits are set this way at reset.

01           This setting selects rotating arbitration, which sets priorities
             between the CPU and DMA channel by alternating their accesses,
             but not exactly equally. Priority rotates between CPU and DMA
             accesses when they conflict during consecutive instruction
             cycles. The first time the DMA channel and the CPU request the
             same resource, the CPU has priority. If, in the following
             instruction cycle, the DMA coprocessor and the CPU again request
             the same resource, the DMA has priority. Alternate access
             continues as long as the CPU and DMA requests conflict in
             consecutive instruction cycles. When there is no conflict in a
             previous instruction cycle, the CPU has priority.

10           Reserved.

11           DMA coprocessor access is higher priority than CPU access. If
             the DMA channel and the CPU are requesting the same resource,
             then the DMA will proceed.
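
The rotating rule for DMA PRI = 01 is easiest to follow cycle by cycle. The Python sketch below applies the rules of Table 11-1 to a sequence of instruction cycles, each given as a pair (CPU requests, DMA requests) for the same resource. It models only the rules as stated in the table; the function name `arbitrate` and the per-cycle tuple format are assumptions made for this illustration.

```python
def arbitrate(cycles, dma_pri):
    """Return the winner ("CPU", "DMA", or None) for each cycle."""
    winners = []
    cpu_has_priority = True               # CPU wins the first conflict
    for cpu_requests, dma_requests in cycles:
        if cpu_requests and dma_requests:
            if dma_pri == 0b00:
                winner = "CPU"
            elif dma_pri == 0b11:
                winner = "DMA"
            elif dma_pri == 0b01:
                winner = "CPU" if cpu_has_priority else "DMA"
                cpu_has_priority = not cpu_has_priority
            else:
                raise ValueError("DMA PRI = 10 is reserved")
        else:
            # No conflict this cycle: whoever asked proceeds, and the
            # CPU regains priority for the next conflict.
            winner = "CPU" if cpu_requests else ("DMA" if dma_requests else None)
            cpu_has_priority = True
        winners.append(winner)
    return winners
```

With four consecutive conflicts the winners alternate CPU, DMA, CPU, DMA; a single conflict-free cycle in between resets the rotation so that the CPU wins the next conflict.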


Table 11-2. TRANSFER MODE (AUX TRANSFER MODE) Field Descriptions

TRANSFER MODE
(bits 3-2;
AUX bits 5-4)   Description
-------------   -----------
00              Transfers are not terminated by the transfer counter, and no
                autoinitialization is performed. TCINT (transfer counter
                interrupt) and AUX TCINT can still be used to cause an
                interrupt when the transfer counter makes a transition to
                zero. The DMA channel continues to run. Note that the address
                continues to increment while the transfer count rolls over to
                its maximum value of 0FFFF FFFFh.

01              Transfers are terminated by the transfer counter. No
                autoinitialization is performed. A halt code of 10₂ is placed
                in the START (or AUX START) field when transfers are
                completed.

10              Autoinitialization is performed when the transfer counter
                goes to zero without waiting for CPU intervention.

11              The DMA channel is autoinitialized when the CPU restarts the
                DMA coprocessor by using the DMA register in the CPU. When
                the transfer counter goes to zero, operation is halted until
                the CPU starts the DMA coprocessor by using the START (AUX
                START) field in the DMA channel control register (bits 23-22
                and 25-24; see Table 11-5). A halt code of 10₂ is placed in
                the START (or AUX START) field by the DMA coprocessor.
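
The four TRANSFER MODE settings differ only in what happens when the transfer counter reaches zero. The Python sketch below restates Table 11-2 as a lookup. The function name `on_counter_zero` and the shape of the returned dictionary are assumptions chosen for this illustration; the halt code 10₂ is the value the table says the DMA coprocessor places in the START field.

```python
HALT_ON_TRANSFER_BOUNDARY = 0b10   # halt code placed in START / AUX START


def on_counter_zero(transfer_mode, start_field=0b11):
    """Describe the channel state after the transfer counter reaches zero."""
    if transfer_mode == 0b00:
        # Counter rolls over; the channel keeps running.
        return {"start": start_field, "running": True, "autoinit": "never"}
    if transfer_mode == 0b01:
        return {"start": HALT_ON_TRANSFER_BOUNDARY, "running": False,
                "autoinit": "never"}
    if transfer_mode == 0b10:
        return {"start": start_field, "running": True,
                "autoinit": "immediately"}
    if transfer_mode == 0b11:
        return {"start": HALT_ON_TRANSFER_BOUNDARY, "running": False,
                "autoinit": "when the CPU restarts the channel"}
    raise ValueError("TRANSFER MODE is a 2-bit field")
```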
Table 11-3. SYNC MODE Field Descriptions in Unified Mode

SYNC MODE
(bits 7-6)   Description
----------   -----------
00           No synchronization. Interrupts are ignored (see Figure 11-27).

01           Source synchronization. A read is not performed until an enabled
             interrupt occurs (see Figure 11-28a). The interrupt is specified
             by the DMAx READ field of the DMA interrupt enable (DIE)
             register (see subsection 11.10.1, Interrupts and Synchronization
             of DMA Channels, for more information).

10           Destination synchronization. A write is not performed until an
             enabled interrupt occurs (see Figure 11-29a). The interrupt is
             specified by the DMAx WRITE field of the DMA interrupt enable
             (DIE) register (see subsection 11.10.1, Interrupts and
             Synchronization of DMA Channels, for more information).

11           Source and destination synchronization. A read is performed when
             an enabled interrupt (specified by the DMAx READ field) occurs.
             Then, a write is performed when an enabled interrupt (specified
             by the DMAx WRITE field) occurs (as shown in Figure 11-30).
             These fields are part of the DMA interrupt enable (DIE) register
             (see subsection 11.10.1, Interrupts and Synchronization of DMA
             Channels, for more information).
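
The effect of the four unified-mode SYNC MODE settings can be sketched as a small event model. In the Python sketch below, `events` is a string of interrupt arrivals, 'R' for the read-sync interrupt and 'W' for the write-sync interrupt, and the function reports how many word transfers complete. The function name `completed_transfers`, the event encoding, and the assumption that each interrupt latch holds at most one pending interrupt are all illustrative choices, not details confirmed by the tables.

```python
def completed_transfers(events, sync_mode, words):
    """Count the word transfers completed after the given interrupt events."""
    need_read_sync = sync_mode in (0b01, 0b11)
    need_write_sync = sync_mode in (0b10, 0b11)
    if not (need_read_sync or need_write_sync):
        return words                      # 00: interrupts are ignored
    read_latch = write_latch = False      # one pending interrupt each (assumption)
    holding = False                       # a word is in the temporary register
    done = 0
    for event in events:
        if event == "R":
            read_latch = True
        elif event == "W":
            write_latch = True
        progressed = True
        while progressed and done < words:
            progressed = False
            if not holding and (read_latch or not need_read_sync):
                read_latch = False        # the read consumes its interrupt
                holding = True
                progressed = True
            if holding and (write_latch or not need_write_sync):
                write_latch = False       # the write consumes its interrupt
                holding = False
                done += 1
                progressed = True
    return done
```

For example, with SYNC MODE = 11 the event string "WR" still completes one transfer: the early write interrupt stays latched until the read has been performed, which matches the note in the SYNC MODE field description.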


Table 11-4. SYNC MODE Field Descriptions in Split Mode

SYNC MODE
(bits 7-6)   Description
----------   -----------
00           No synchronization. Interrupts are ignored (see Figure 11-27).

01           Destination synchronization. A primary channel write to the
             communication-port output FIFO is not performed until an enabled
             interrupt occurs (see Figure 11-29b). The interrupt is specified
             by the DMAx PRIMARY WRITE field of the DMA interrupt enable
             (DIE) register (see subsection 11.10.1, Interrupts and
             Synchronization of DMA Channels, for more information).

10           Source synchronization. An auxiliary-channel read from the
             communication-port input FIFO is not performed until an enabled
             interrupt occurs (see Figure 11-28b). The interrupt is specified
             by the DMAx AUXILIARY READ field of the DMA interrupt enable
             (DIE) register (see subsection 11.10.1, Interrupts and
             Synchronization of DMA Channels, for more information).

11           Source and destination synchronization. A read from the
             communication-port input FIFO is performed when an enabled
             interrupt (specified by the DMAx AUXILIARY READ field) occurs. A
             write to the communication-port output FIFO is performed when an
             enabled interrupt (specified by the DMAx PRIMARY WRITE field)
             occurs. These fields are part of the DMA interrupt enable (DIE)
             register (see subsection 11.10.1, Interrupts and Synchronization
             of DMA Channels, for more information).
Table 11-5. START (AUX START) Field Descriptions

START
(bits 23-22;
AUX START
bits 25-24)   Description
-----------   -----------
00            DMA channel reset. DMA-channel read or write cycles in progress
              are completed (not aborted); any data read is ignored. Any
              pending (not started) read or write is canceled. The auxiliary
              (AUX START = 00₂) and primary (START = 00₂) transfer counters
              are set to zero. The DMA channel is reset so that when it
              starts, a new transaction begins; that is, a read is performed.
              In this mode, stopping is immediate with no other registers
              loaded.

01            DMA halt on read or write boundary. Halts the DMA channel on
              the first available read or write boundary. If a read or write
              has begun, the read or write is completed before stopping. If a
              read or write has not begun, no read or write is started. In
              this mode, stopping is immediate with no other registers
              loaded.

10            DMA halt on transfer boundary. Halts the DMA channel on the
              first available transfer boundary. If a DMA transfer has begun,
              the entire transfer is completed, including both cycles (both
              read and write operations), before stopping. If a transfer has
              not begun, none is started. In this mode, stopping is immediate
              with no other registers loaded. This is also the value after a
              DMA transfer completes.

11            DMA start. Writing 11 to this field starts the DMA process
              using the values in the channel’s DMA channel registers
              (Figure 11-1). If the DMA is in autoinitialization, all DMA
              registers are loaded before starting the operation. The DMA
              coprocessor starts from reset if previously reset (START or AUX
              START bits = 00₂) or restarts from the previous state if
              previously halted (START or AUX START bits = 01₂ or 10₂).


Table 11-6. STATUS (AUX STATUS) Field Descriptions

STATUS
(bits 27-26;
AUX STATUS
bits 29-28)   Description
-----------   -----------
00            The DMA channel is held on the boundary of the DMA transfer
              (the write is complete, and the read has not begun). This is
              the value at reset, after a halt on a transfer boundary, or
              after a block transfer.

01            The DMA channel is being held in the middle of a DMA transfer
              (the read is complete, and the write has not begun). This
              occurs only if the START (or AUX START) field = 01₂.

10            Reserved.

11            The DMA channel is not being held or reset.
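
The field positions quoted in Tables 11-1 through 11-6 and in the surrounding text are enough to pack and unpack a control register value in software. The Python sketch below does this for the fields whose bit numbers appear in this chapter's text; fields whose positions are given only in Figure 11-2 are deliberately left out. The names `FIELDS`, `pack`, and `unpack` are hypothetical.

```python
# name: (least significant bit, width), taken from the bit numbers in the text
FIELDS = {
    "DMA_PRI": (0, 2),
    "TRANSFER_MODE": (2, 2),
    "AUX_TRANSFER_MODE": (4, 2),
    "SYNC_MODE": (6, 2),
    "READ_BIT_REV": (12, 1),
    "WRITE_BIT_REV": (13, 1),
    "SPLIT_MODE": (14, 1),
    "TCINT_FLAG": (20, 1),
    "AUX_TCINT_FLAG": (21, 1),
    "START": (22, 2),
    "AUX_START": (24, 2),
    "STATUS": (26, 2),
    "AUX_STATUS": (28, 2),
}


def pack(**values):
    """Build a control register word from named field values."""
    word = 0
    for name, value in values.items():
        lsb, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word |= value << lsb
    return word


def unpack(word):
    """Split a control register word into the named fields."""
    return {name: (word >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FIELDS.items()}


# Unified mode, stop when the counter reaches zero, start the channel.
control = pack(TRANSFER_MODE=0b01, SYNC_MODE=0b00, START=0b11)
```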




11.3.2 Address and Index Registers 


As shown in Figure 11-3, both the DMA coprocessor source-address and
destination-address registers have an associated index register. After each
DMA-channel read (source address) or write (destination address), the
corresponding (source or destination) address generator adds the index
register to the address register and places the result in the address
register. In this way, the address register acts as an accumulator: it holds
the running sum of its initial value and every index added so far, as shown
by the following equation:


Address Register + Index Register → Address Register

The values in these registers are undefined at reset.


Depending upon bits 12 and 13 (READ BIT REV and WRITE BIT REV) of the
DMA channel control register, the addition may be either:


- Linear (normal addition): READ BIT REV = 0 or WRITE BIT REV = 0, or


- Bit reversed (reverse carry propagation): READ BIT REV = 1 or WRITE
  BIT REV = 1.


Both index values (source and destination) are signed values.
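
Reverse carry propagation means the carry travels from the most significant bit toward the least significant bit. One common way to express it in software is to bit-reverse both operands, add them normally, and bit-reverse the result. The Python sketch below does this over a 24-bit field, matching the 24-bit bit-reversed addressing mentioned for READ BIT REV and WRITE BIT REV. Leaving address bits above bit 23 unchanged is an assumption of this sketch, and the names `bit_reverse` and `reverse_carry_add` are hypothetical.

```python
def bit_reverse(value, bits):
    """Reverse the order of the low `bits` bits of `value`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(address, index, bits=24):
    """Add `index` to `address` with the carry propagating toward bit 0."""
    mask = (1 << bits) - 1
    total = bit_reverse(address & mask, bits) + bit_reverse(index & mask, bits)
    low = bit_reverse(total & mask, bits)
    return (address & ~mask) | low        # upper bits kept as-is (assumption)


# Address sequence for an 8-point FFT buffer at address 0: index = 8 / 2 = 4.
address, sequence = 0, []
for _ in range(8):
    sequence.append(address)
    address = reverse_carry_add(address, 4)
# sequence is [0, 4, 2, 6, 1, 5, 3, 7]
```

With the index set to half the FFT size, successive additions visit the buffer in bit-reversed order, which is the access pattern an in-place FFT needs.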
Figure 11-3. DMA Coprocessor Address Generation

(a) Source address register operation
(b) Destination address register operation


11.3.3 Transfer Counter and Auxiliary Transfer Counter Registers 


These registers contain the number of words to be transferred.


Figure 11-4 shows the six transfer counters and the six auxiliary transfer
counters. A DMA channel in split mode (described in Section 11.5, DMA Split
Mode) uses the auxiliary transfer counter for the auxiliary channel and the
primary transfer counter for the primary channel. The values in these
registers are set to zero at reset.


The counters are decremented after completing the address fetch for the write
portion of a transfer. The TCINT FLAG and AUX TCINT FLAG (bits 20 and 21
of the DMA channel control register, as shown in Figure 11-2) are not set until
the counter is decremented and the write of the last transfer is completed.
Correspondingly, the interrupt is not seen by the CPU interrupt controller
until the transfer counter is decremented and the write of the last transfer
is completed.


The decrementer checks whether the transfer counter equals zero after the
decrement is performed. As a result, if the counter register has a value of 1,
the DMA channel can be halted after only one transfer is performed. Thus,
setting the transfer counter to 1 makes the DMA channel transfer the minimum
possible number of words (one). The count is treated as an unsigned integer.
Transfers can be halted when a zero count is detected after a decrement. If
the DMA channel is not halted after the transfer counter reaches zero, the
counter continues decrementing below zero. Thus, setting the transfer counter
to zero makes the DMA channel transfer the maximum possible number of words
(1 0000 0000h, that is, 2^32).
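
The decrement-then-test order explains why a count of 1 gives one transfer and a count of 0 gives the maximum. The Python sketch below models the counter as a 32-bit unsigned value; the names `decrement` and `transfers_until_zero` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def decrement(counter):
    """Decrement a 32-bit unsigned counter, then test it for zero."""
    counter = (counter - 1) & MASK32
    return counter, counter == 0


def transfers_until_zero(initial):
    """Number of transfers before the zero test first succeeds."""
    return ((initial - 1) & MASK32) + 1
```

Starting from 1, the first decrement yields zero, so exactly one word is transferred. Starting from 0, the first decrement wraps to 0FFFF FFFFh, and zero is not seen again until 2^32 transfers have been made.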


Figure 11-4. Transfer Counter Registers

Elements shown: decrementer, transfer counter x, auxiliary transfer counter x
(x = DMA channel number, 0-5).


11.3.4 Link Pointer and Auxiliary Link-Pointer Registers 


The link pointers specify the address from which to load the new DMA channel 
register values when autoinitialization is performed. When a channel has ex- 
hausted its counter (transfer counter = 0), it will (if appropriately configured) 
use the link pointer to reload itself. Figure 11-5 illustrates the DMA coproces- 
sor link address registers. The values in these registers are undefined at reset. 
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For example, under autoinitialization, the steps to load the channel registers 
for DMA channel 0 (as shown in Figure 11—1) are: 


1) Get the link pointer for the next DMA operation. The pointer is the memory
   address containing the contents of the first DMA channel 0 register (the
   channel control register as shown in Figure 11-1).


2) Bring in the contents pointed to by the pointer and write them to address
   0010 00A0h (first word of DMA channel 0 registers as shown in
   Figure 11-1).


3) Increment the link pointer. (Skip this step if the AUTOINIT STATIC bit = 1.)


4) Bring in the next word and write it to address 0010 00A1h.


5) Repeat until the entire block of registers is loaded for DMA channel 0 (7
   registers in unified mode; 5 registers in split mode).
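The steps above can be sketched as a small Python model of the unified-mode
case. This is a sketch under stated assumptions, not a description of the
hardware implementation: the read_word callback and the dictionary standing in
for the register file are hypothetical, and the seven registers are assumed to
occupy consecutive addresses starting at 0010 00A0h (the text gives only the
first two addresses). The five-register split-mode case is not modeled.

```python
DMA0_BASE = 0x001000A0   # DMA channel 0 control register address, from the text


def autoinitialize(read_word, link_pointer, registers, static=False, count=7):
    """Load `count` channel registers, starting at DMA0_BASE.

    read_word(addr) -> int : hypothetical memory (or FIFO) read callback
    link_pointer           : address of the first autoinitialization value
    registers              : dict standing in for the DMA register file
    static                 : models AUTOINIT STATIC = 1 (pointer held constant)
    Returns the final value of the working pointer.
    """
    for offset in range(count):
        # Steps 2 and 4: fetch the word and write it to the next register.
        registers[DMA0_BASE + offset] = read_word(link_pointer)
        if not static:
            link_pointer += 1            # step 3: advance to the next stored value
    return link_pointer
```

For values stored in memory, read_word can simply index a table. For a
stream-oriented source, static=True keeps the address fixed while each read
returns the next word.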


Figure 11-5. Link Pointer Registers


11.4 DMA Unified Mode


Unified mode is the default DMA operational mode. It is used for
memory-to-memory transfers. To select unified mode, clear the SPLIT MODE bit
(bit 14 of the DMA channel control register, which is shown in Figure 11-2)
by writing a zero to it. Zero is also the reset value of this bit.


The block transfer sequence under unified mode is covered in subsection
11.2.1. DMA channel arbitration in unified mode is described in Section 11.6.
DMA synchronization with interrupts is covered in Section 11.10, DMA and
Interrupts. Autoinitialization in unified mode is covered in subsection 11.9.1,
Unified Mode.


A unified DMA word transfer consists of two steps, as shown in Figure 11-6: 


1) The DMA channel reads the source data value from the address pointed 
to by the source address register and stores it in a temporary register. 


2) The DMA channel reads the temporary register value and writes it to the 
address pointed to by the destination address register. 


You can use unified mode to perform communication port transfers, especially
unidirectional transfers. Split mode is more advantageous for bidirectional
transfers.


Figure 11-6. Typical Unified-Mode DMA Channel Configuration


11.5 DMA Split Mode


The DMA split mode (see Figure 11-7) allows one DMA channel to be used
for both reading data from and writing data to a communication port. Split
mode essentially transforms one DMA channel into two DMA channels:


- Primary channel: dedicated to reading data from a location in the
  memory map (external/internal) and writing it to a communication port
  output FIFO.


- Auxiliary channel: dedicated to receiving data from a communication
  port input FIFO and writing it to a location in the memory map.


To select split mode, set the SPLIT MODE bit (bit 14 of the DMA channel
control register, Figure 11-2) to one.


All six DMA channels support split mode to accommodate all of the
communication ports. The COM PORT field (bits 15-17 as shown in Figure 11-2)
of the DMA channel control register defines which communication port is used
(port 0-5). A DMA channel in split mode can be used with any communication
port; however, read/write synchronization is restricted to signals from the
communication port with the same number as the DMA channel being used. In
other words, DMA channel i can synchronize only with signals coming from
communication port i (see Section 11.10, DMA and Interrupts, for more
information). Figure 11-7 shows typical split-mode operation with one
communication port.
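As an illustration of the two fields just described, the following Python
sketch builds a control register value with SPLIT MODE and COM PORT set. The
bit positions come from the text; the function names are hypothetical, and the
sketch manipulates a plain integer rather than the memory-mapped register.

```python
SPLIT_MODE_BIT = 14                         # SPLIT MODE, bit 14
COM_PORT_SHIFT = 15                         # COM PORT, bits 15-17
COM_PORT_MASK = 0b111 << COM_PORT_SHIFT


def select_split_mode(control: int, com_port: int) -> int:
    """Return `control` with SPLIT MODE = 1 and COM PORT = com_port."""
    if not 0 <= com_port <= 5:
        raise ValueError("communication port must be in the range 0-5")
    control |= 1 << SPLIT_MODE_BIT
    # Clear the old COM PORT field before inserting the new port number.
    return (control & ~COM_PORT_MASK) | (com_port << COM_PORT_SHIFT)


def select_unified_mode(control: int) -> int:
    """Return `control` with SPLIT MODE = 0 (the reset value)."""
    return control & ~(1 << SPLIT_MODE_BIT)
```

All other fields of the control word are left untouched, which matters because
the same register also carries the start, synchronization, and priority
fields.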


A split-mode word transfer is similar to that of the unified mode except for
the following differences:


- The primary channel reads a word from the address pointed to by the
  source address register and writes it to a temporary register within the
  DMA coprocessor. It then writes the temporary register value to the output
  FIFO on the communication port specified in the COM PORT field. The
  registers that control the primary channel are the DMA channel control
  register, source address register, source index register (added to the
  source address register), transfer-counter register, and link pointer
  register.


- The auxiliary channel reads a word from the input FIFO on the
  communication port specified in the COM PORT field and writes it to a
  temporary register within the DMA coprocessor. It then writes the temporary
  register value to the address pointed to by the destination address
  register. The registers that control the auxiliary channel are the DMA
  channel control register, destination address register, destination index
  register (added to the destination address register), auxiliary
  transfer-counter register, and auxiliary link pointer register.


DMA channel arbitration in split mode is described in subsection 11.6.3, Split 
Mode and DMA Channel Arbitration. DMA synchronization with interrupts is 
covered in Section 11.10, DMA and Interrupts. Autoinitialization in split mode 
is covered in subsection 11.9.2, Split Mode. 


Figure 11-7. Typical Split-Mode DMA Configuration


Notice that there is only one temporary register in each DMA channel.
Therefore, a primary channel operation must complete before an auxiliary
channel operation can begin, and vice versa.


Primary and auxiliary channels share some fields of the DMA channel control
register and use others exclusively:


- PRIORITY MODE, COM PORT, SPLIT MODE, and DMA PRI are fields
  that both primary and auxiliary channels use.


- AUX STATUS, AUX START, AUX TCINT flag, AUX TCC, WRITE BIT REV,
  SYNC MODE (bit 7), and AUX TRANSFER MODE are used exclusively
  by the auxiliary channel.


- STATUS, START, TCINT flag, TCC, READ BIT REV, SYNC MODE (bit 6),
  and TRANSFER MODE are used exclusively by the primary channel.


11.6 DMA Internal Priority Schemes


Because all accesses made by the six DMA channels take place over one
common internal DMA data and address bus, a priority scheme for bus
arbitration is required. Within the DMA coprocessor, two priority schemes are
used to designate which channel is serviced next:


- A fixed priority scheme with channel 0 always having the highest priority
  and channel 5 the lowest.


- A rotating priority scheme that places the most recently serviced channel
  at the bottom of the priority list (default setup after reset).


11.6.1 Fixed Priority Scheme 


This scheme provides a fixed (unchanging) priority for each channel, as
follows:

   Priority    Channel
   Highest     0
               1
               2
               3
               4
   Lowest      5


To select fixed priority, set the PRIORITY MODE bit (bit 30) of channel 0’s 
DMA-channel control register to 1 (one). 


11.6.2 Rotating Priority Scheme


In a rotating priority scheme, the last channel serviced becomes the lowest
priority channel. The other channels rotate sequentially through the priority
list, with the channel that follows the last-serviced channel becoming the
highest priority on the following request. The priority rotates every time the
channel most recently granted priority completes its access. Figure 11-8 and
Figure 11-10 illustrate the rotation of priority across several DMA coprocessor
accesses. At system reset, the channels are ordered from highest to lowest
priority (0, 1, 2, 3, 4, 5).


To select this scheme, set the PRIORITY MODE bit (bit 30) of channel 0’s
DMA channel control register to 0 (zero).


Figure 11-8. Rotating Priority Mode Example of the DMA Coprocessor


Service   Priority order (highest to lowest)   Channel serviced
1st       0    1    †2   3    †4   †5          2
2nd       3    †4   †5   0    1    †2          4
3rd       †5   0    1    †2   3    †4          5
4th       0    1    †2   3    †4   †5


† DMA channel requesting an access


Each service is one read access or one write access. See
Figure 11-9 for an example of a read/write sequence.


At the start of the example in Figure 11-8, channels 2, 4, and 5 are requesting
service. Because channel 2 has the highest priority of the three, it is
serviced first. It then becomes the lowest priority channel, and channel 3
becomes the highest priority channel. On the following services, channels 4
and 5 are serviced in a similar fashion. Figure 11-9 shows the entire read and
write sequence.
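The rotation in this example can be reproduced with a few lines of Python.
This is a minimal sketch of the rule as described in the text, with
hypothetical function names; it models only the ordering of the priority list,
not bus timing.

```python
def rotate_after_service(order, serviced):
    """Return the new priority list (highest first) after `serviced` is granted.

    The serviced channel drops to the bottom, and the channel that followed it
    in the list becomes the highest priority; relative order is preserved.
    """
    i = order.index(serviced)
    return order[i + 1:] + order[:i + 1]


def next_channel(order, requesting):
    """Highest-priority channel that is currently requesting, or None."""
    return next((ch for ch in order if ch in requesting), None)


order = [0, 1, 2, 3, 4, 5]        # priority order at system reset
requesting = {2, 4, 5}            # channels requesting service in the example
history = []
for _ in range(3):
    ch = next_channel(order, requesting)
    history.append(ch)
    order = rotate_after_service(order, ch)
```

After the three services, history holds the channels in the order they were
serviced and order has returned to its reset value, which is why the fourth
service in the example starts from the same list as the first.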


Note:


Each service means one read access or one write access. The DMA
coprocessor handles channel arbitration on an access-by-access basis; that
is, a DMA channel must contend for both the read and the write access in
both unified and split modes.


Figure 11-9. Rotating Priority DMA Read and Write Sequence Example (Unified Mode)


Service   Priority order (highest to lowest)   Access
1st       0    1    †2   3    †4   †5          DMA 2 read
2nd       3    †4   †5   0    1    †2          DMA 4 read
3rd       †5   0    1    †2   3    †4          DMA 5 read
4th       0    1    †2   3    †4   †5          DMA 2 write
5th       3    †4   †5   0    1    †2          DMA 4 write
6th       †5   0    1    †2   3    †4          DMA 5 write
7th       0    1    †2   3    †4   †5


† DMA channel requesting an access


Figure 11-10 presents the results of Figure 11-8 in a different form, as a
priority wheel. Priority decreases from highest to lowest in a clockwise
direction. The priority rotates in a counterclockwise direction, with the
most recently serviced channel becoming the lowest in priority.


Figure 11-10. Example of a Priority Wheel


[Priority wheels for successive services, each marking the highest priority
channel and the lowest priority channel]


† DMA channel requesting an access


With the rotating priority scheme, any DMA channel requesting service is
guaranteed to be recognized after a bounded number of higher priority requests
have been serviced. The maximum number of such requests is:


- Five in unified mode
- Eleven in split mode


This provides a way of preventing a channel from monopolizing the system. 


DMA channels that are running and are not synchronized via interrupts are al- 
ways requesting service. 


11.6.3 Split Mode and DMA Channel Arbitration


When a DMA channel is running in split mode, arbitration between channels
is similar to rotating priorities. A split-mode DMA channel has the same
priority as a unified DMA channel. The only additional issue is how to
arbitrate between the primary split channel and the auxiliary split channel.
The split channels alternate priorities via a rotating priority scheme.


When a DMA channel is in split mode and both paths are started simultaneously
via the START and AUX START bits, the output (primary) channel has
priority over the input (auxiliary) channel. Both the START and AUX START
bits must be written at the same time in order to achieve this reset condition.


The priority scheme for split-mode channels is slightly different from the
scheme for unified-mode channels:


- For unified channels, the priority changes after a read or a write.


- For the primary and auxiliary channels within a split channel, priority
  changes after a complete read and write. This is because there is only one
  temporary register for both DMA channels (primary and auxiliary) to store
  the read value.
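The two rules above can be combined in a short Python model. It is a sketch
under simplifying assumptions: channel 2 is a split channel and channel 4 a
unified channel (the same pairing as in Figure 11-11), both are always ready to
transfer, and the function names and log labels are hypothetical.

```python
def rotate(order, serviced):
    """Move `serviced` to the bottom by rotating the priority list."""
    i = order.index(serviced)
    return order[i + 1:] + order[:i + 1]


def run(services):
    """Return a log of accesses, for example ['2pri R', '4 R', '2pri W', ...]."""
    order = [0, 1, 2, 3, 4, 5]            # channel priority, highest first
    requesting = [2, 4]                   # channel 2 is split, channel 4 unified
    split_order = {2: ["pri", "aux"]}     # sub-channel priority inside channel 2
    pending_write = {}                    # channel -> label whose read awaits a write
    log = []
    for _ in range(services):
        ch = next(c for c in order if c in requesting)
        if ch in pending_write:           # temporary register is full: write it
            log.append(pending_write.pop(ch) + " W")
            if ch in split_order:         # read and write complete: swap pri/aux
                split_order[ch].reverse()
        else:                             # temporary register is empty: read
            label = f"{ch}{split_order[ch][0]}" if ch in split_order else str(ch)
            pending_write[ch] = label
            log.append(label + " R")
        order = rotate(order, ch)         # channel priority rotates on every access
    return log
```

Calling run(9) yields the primary read and write, then the auxiliary read and
write, each interleaved with channel 4, before the pattern starts again with a
primary read.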


Figure 11-11 shows two channels contending for the DMA bus: channel 2 (a 
split channel) and channel 4. 


Figure 11-11. Example of a Channel Priority Scheme in Split Mode


   Highest priority channel   0

   Lowest priority channel    5


† DMA channel requesting an access
‡ Split channels requesting access

2pri = the primary split channel of channel 2
2aux = the auxiliary split channel of channel 2


The channel priority scheme in Figure 11-11 is shown sequentially in
Figure 11-12. In words, the scheme follows eight steps:


1) The first service is a request by the primary split channel of channel 2
   (2pri). 2pri reads, and then channel 2 is moved to the lowest priority
   level, but 2pri remains the higher priority channel of channel 2.


2) On the second service, channel 4, now a higher priority than channel 2,
   reads its source address and becomes the lowest priority.


3) On the third service, the value read by 2pri is written to its destination
   address, and channel 2 is moved to the lowest priority level. Also, 2pri is
   moved to a lower priority than 2aux. Note that the split channel that just
   completed a read retains a higher priority than the other split channel
   until the data is written to the destination address.


4) On the fourth service, the value read by channel 4 in service 2 is written
   to its destination address, and the channel becomes the lowest priority.


Figure 11-12. Service Sequence for Split Mode Priority Example


Service   Priority order (highest to lowest)            Access
1st       0    1    ‡[2pri 2aux]    3    †4    5        2pri read
2nd       3    †4   5    0    1    ‡[2pri 2aux]         4 read
3rd       5    0    1    ‡[2pri 2aux]    3    †4        2pri write
4th       3    †4   5    0    1    ‡[2aux 2pri]         4 write
5th       5    0    1    ‡[2aux 2pri]    3    †4        2aux read
6th       3    †4   5    0    1    ‡[2aux 2pri]         4 read
7th       5    0    1    ‡[2aux 2pri]    3    †4        2aux write
8th       3    †4   5    0    1    ‡[2pri 2aux]         4 write
9th       5    0    1    ‡[2pri 2aux]    3    †4        2pri read (sequence repeats)


† DMA channel requesting an access
‡ Split channels requesting access


2pri = the primary split channel of channel 2
2aux = the auxiliary split channel of channel 2


5) In the fifth service, 2aux is read and channel 2 becomes the lowest
   priority.


6) On the sixth service, channel 4 is read again, and it becomes the lowest
   priority.


7) On the seventh and eighth services, the 2aux and channel 4 values that
   were read in services 5 and 6 are written to their destination addresses.
   After each channel is written, it assumes the lowest priority.


8) In the ninth service, 2pri is read again as in the first service, and the
   read/write cycle continues as begun in the first service.


11.7 CPU and DMA Coprocessor Arbitration


The DMA coprocessor transfers data on its own internal buses. Arbitration is
necessary only when a resource conflict exists between the DMA coprocessor
and the CPU. The arbitration itself causes no delay. When there is no
conflict, the CPU and DMA coprocessor accesses proceed in parallel.


All arbitration between the CPU and the DMA coprocessor is on an access
basis; that is, the DMA coprocessor must contend for read and write accesses
in both unified and split modes. DMA coprocessor internal memory access starts
during H3 (see Section 8.4, Clocking of Memory Accesses, on page 8-19, for
more information).


When the CPU and DMA coprocessor request the same resource, the DMA
channel’s DMA PRI bits (bits 0 and 1 of the channel control register) define
the arbitration rules (as shown in Table 11-7). The CPU has higher priority
than the DMA when DMA PRI = 00₂; it has lower priority than the DMA when DMA
PRI = 11₂. They rotate priority when DMA PRI = 01₂.
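The following Python sketch models these rules cycle by cycle, including the
rotating case that Table 11-7 describes in detail. It is illustrative only:
the generator name and the "CPU"/"DMA" labels are hypothetical, and a cycle
without a conflict is reported as None because both accesses proceed.

```python
def arbitrate(dma_pri, conflicts):
    """Yield the winner ("CPU", "DMA", or None) for each instruction cycle.

    dma_pri   : DMA PRI field value (0b00, 0b01, or 0b11; 0b10 is reserved)
    conflicts : iterable of booleans, True when the CPU and the DMA channel
                request the same resource in that cycle
    """
    if dma_pri == 0b10:
        raise ValueError("DMA PRI = 10 is reserved")
    last_winner = None                 # winner of the previous cycle's conflict
    for conflict in conflicts:
        if not conflict:
            last_winner = None         # no conflict: the rotation starts over
            yield None
            continue
        if dma_pri == 0b00:
            winner = "CPU"             # CPU always wins
        elif dma_pri == 0b11:
            winner = "DMA"             # DMA always wins
        else:
            # 0b01: the CPU wins the first conflict, then the winners alternate
            # for as long as conflicts occur in consecutive cycles.
            winner = "DMA" if last_winner == "CPU" else "CPU"
        last_winner = winner
        yield winner
```

With DMA PRI = 01, three consecutive conflicts are resolved as CPU, DMA, CPU,
and a conflict that follows a conflict-free cycle goes to the CPU again.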


Table 11-7. DMA PRI Bits and CPU/DMA Arbitration Rules


DMA PRI
(Bits 1-0)   Description

00           DMA access is lower priority than the CPU access. If the DMA
             channel and the CPU are requesting the same resource, then the
             CPU will proceed. (DMA PRI bits are set to 00₂ at reset.)

01           This setting selects rotating arbitration, which sets priorities
             between the CPU and DMA channel by alternating their accesses,
             but not exactly equally. Priority rotates between CPU and DMA
             accesses when they conflict during consecutive instruction
             cycles. The first time the DMA channel and the CPU request the
             same resource, the CPU has priority. If, in the following
             instruction cycle, the DMA coprocessor and the CPU again request
             the same resource, the DMA has priority. Alternate access
             continues as long as the CPU and DMA requests conflict in
             consecutive instruction cycles. When there is no conflict in a
             previous instruction cycle, the CPU has priority.

10           Reserved

11           DMA access is higher priority than the CPU access. If the DMA
             channel and the CPU are requesting the same resource, the DMA
             will proceed.


11.8 Data Transfer Modes


Each DMA channel can operate in one of four data transfer modes. These
modes differ in:


- Whether or not they use autoinitialization
- How they operate if autoinitialization is in effect or not


Table 11-8 and the following paragraphs describe these data transfers.


Table 11-8. TRANSFER MODE (AUX TRANSFER MODE) Field Descriptions


TRANSFER MODE
(AUX TRANSFER MODE)
Bits 3-2 (5-4)        Description

00₂                   Transfers are not terminated by the transfer counter.
                      No autoinitialization is performed. The TCINT (transfer
                      count interrupt) bits can still be used to cause an
                      interrupt when the transfer counter makes a transition
                      to zero. The DMA channel continues to run.

01₂                   Transfers are terminated by the transfer counter. No
                      autoinitialization is performed. A halt code of 10₂ is
                      placed in the START or AUX START field (bits 22-23 or
                      bits 24-25 of the DMA channel control register) when
                      transfers are complete.

10₂                   Autoinitialization 1. Autoinitialization is performed
                      when the transfer counter goes to zero, without waiting
                      for CPU intervention.

11₂                   Autoinitialization 2. The DMA channel is autoinitialized
                      when the CPU restarts the DMA coprocessor through the
                      DMA channel control register. When the transfer counter
                      goes to zero, operation is halted until the CPU starts
                      the DMA coprocessor by using the START (or AUX START)
                      field in the DMA channel control register. A halt code
                      of 10₂ is placed in the START (or AUX START) field by
                      the DMA.
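The four rows of Table 11-8 can be summarized as a small lookup, shown here as
a Python sketch. The dictionary keys and the function name are hypothetical;
the halt code, the START field position, and the per-mode behavior are taken
from the table. The sketch covers only the primary START field and operates on
a plain integer, not on the memory-mapped register.

```python
START_SHIFT = 22          # START field, bits 22-23 (AUX START uses bits 24-25)
START_MASK = 0b11 << START_SHIFT
HALT_CODE = 0b10          # halt code placed in the field by the DMA

# What happens when the transfer counter makes its transition to zero.
MODES = {
    0b00: {"halts": False, "autoinit": False, "waits_for_cpu": False},
    0b01: {"halts": True,  "autoinit": False, "waits_for_cpu": False},
    0b10: {"halts": False, "autoinit": True,  "waits_for_cpu": False},
    0b11: {"halts": True,  "autoinit": True,  "waits_for_cpu": True},
}


def control_after_counter_zero(control: int, transfer_mode: int) -> int:
    """Return the control register value once the counter has reached zero."""
    if MODES[transfer_mode]["halts"]:
        # The DMA replaces the START field with the halt code.
        control = (control & ~START_MASK) | (HALT_CODE << START_SHIFT)
    return control
```

In modes 00 and 10 the control word is unchanged because the channel keeps
running; in modes 01 and 11 the START field reads back as 10 until the CPU
writes it again.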


11.8.1 Running in TRANSFER MODE = 00₂


When TRANSFER MODE = 00₂, transfers are not terminated when the transfer
counter goes to zero, and no autoinitialization is performed. Even though
the transfer counter does not halt transfers, an interrupt can be generated on
the transfer counter transition to zero, which sets the TCINT FLAG bit to 1.
Because the DMA coprocessor channel is not halted when the transfer counter
reaches zero, the counter continues decrementing below zero.


11.8.2 Running in TRANSFER MODE = 01₂


When TRANSFER MODE = 01₂, transfers are terminated when the transfer
counter goes to zero, and no autoinitialization is performed. When the transfer
counter goes to zero, the DMA channel is halted by forcing 10₂ into the START
or AUX START field.


11.8.3 Running in TRANSFER MODE = 10₂ (Autoinitialization 1)


This transfer mode allows the DMA channel to run continuously, change
pointers and synchronization by the autoinitialization procedure, and turn
itself off. Two different autoinitialization methods are supported:


Autoinitialization method 1a always starts after a system reset, after a DMA
channel is reset (00₂ written to the START or AUX START bits), or after a DMA
channel halts (01₂ or 10₂ written to the START or AUX START bits). To select
transfer mode 10₂ (autoinitialization method 1a), follow the steps listed here
and shown in Figure 11-13.


1) Initialize the DMA control register to transfer mode 10₂, and reset or halt
   the DMA channel to be autoinitialized.


2) Initialize the transfer counter to 0 (resetting the DMA channel does this).


3) Initialize the DMA channel link pointer with the address where the
   autoinitialization values reside. No initialization of the other DMA
   channel registers is required, because they are automatically set up during
   the autoinitialization process.


4) Start the DMA channel by writing 11₂ to the START (or AUX START) bits.


5) The DMA channel performs the sequence: autoinitialize, then block
   transfer.


Figure 11-13. DMA Channel Running in Transfer Mode 10₂ (Autoinitialization Method 1a)


   DMA is reset or halted and transfer counter = 0

   CPU initializes DMA link pointer and control register

   Start DMA channel

   DMA coprocessor autoinitializes

   DMA channel performs block transfers.
   Reads and writes can be synchronized.


Autoinitialization method 1b starts when the transfer counter is not zero.
The DMA starts a regular DMA transfer and autoinitializes after this transfer
completes (when the transfer counter becomes zero). To select transfer mode
10₂ (autoinitialization method 1b), follow the steps listed here and shown in
Figure 11-14.


1) Initialize the DMA control register to transfer mode 10₂, and reset or halt
   the DMA channel for the first transfer operation.


2) Initialize all the other DMA channel registers (source address, destination
   address, transfer counter, etc.) according to the transfer operation
   desired. Note that the transfer counter now reflects the number of words to
   be transferred (normally a nonzero value) before the autoinitialization
   process.


3) Initialize the DMA channel link pointer with the address where the
   autoinitialization values for subsequent transfer operations reside.


4) Start the DMA channel by writing 11₂ to the START (or AUX START) bits.


5) The DMA channel performs this sequence: block transfer, then autoinitialize
   (the reverse order of method 1a).


Note that if a DMA channel is programmed to perform n block transfers,
autoinitialization method 1a requires n sets of DMA autoinitialization values.
Autoinitialization method 1b requires only n-1 sets because the first transfer
is set up by the initial register values. This represents some memory saving,
but successive identical DMA operations require extra CPU cycles to set the
initial DMA register values again.


Figure 11-14. DMA Channel Running in Transfer Mode 10₂ (Autoinitialization Method 1b)


   DMA is reset or halted

   CPU initializes DMA registers and DMA link pointer

   Start DMA channel

   DMA channel performs block transfers.
   Reads and writes can be synchronized.

   DMA coprocessor autoinitializes


11.8.4 Running in TRANSFER MODE = 11₂ (Autoinitialization 2)


This transfer mode, besides having all of the advantages of autoinitialization,
allows the CPU to coordinate its operation very easily with the operation of
the DMA channels. Two different autoinitialization methods are supported:


Autoinitialization method 2a always starts after a system reset, after a DMA
channel reset (00₂ written to the START or AUX START bits), or after a channel
halts (01₂ or 10₂ written to the START or AUX START bits). To select transfer
mode 11₂ and use autoinitialization method 2a, follow the steps listed here and
shown in Figure 11-15.


1) Initialize the DMA control register to transfer mode 11₂ and reset or halt
   the DMA channel to be autoinitialized.


2) Initialize the transfer counter to 0 (resetting the DMA channel does this).


3) Initialize the DMA channel link pointer with the address where the
   autoinitialization values reside. No initialization of the other DMA
   channel registers is required, because they are automatically set up during
   the autoinitialization process.


4) Start the DMA channel by writing 11₂ to the START or AUX START bits.
   The DMA channel autoinitializes itself and performs a block transfer.


5) When the transfer counter goes to zero, the DMA waits for the CPU to write
   11₂ to the START (or AUX START) field of the DMA channel control register,
   and then autoinitializes.


6) Repeat the sequence: autoinitialize, transfer, and wait.


When the transfer counter goes to zero, you can halt the DMA channel by
forcing 10₂ into the START (or AUX START) field.


Figure 11-15. DMA Channel Running in Transfer Mode 11₂ (Autoinitialization Method 2a)


   DMA is reset or halted and transfer counter = 0


Autoinitialization method 2b starts when the transfer counter is not zero.
The DMA starts with a regular DMA transfer and autoinitializes after this
transfer completes (when the transfer counter becomes zero). To select transfer
mode 11₂ and use autoinitialization method 2b, follow the steps listed here and
shown in Figure 11-16.


1) Initialize the DMA control register to transfer mode 11₂ and reset or halt
   the DMA channel for the first transfer operation.


2) Initialize the other DMA channel registers (source address, destination
   address, transfer counter, etc.) accordingly. Note that the transfer counter
   now reflects the number of words to be transferred (normally a nonzero
   value) before the autoinitialization process.


3) Initialize the DMA channel link pointer with the address where the
   autoinitialization values for subsequent transfer operations reside.


4) Start the DMA channel by writing 11₂ to the START (or AUX START) bits.


5) The DMA channel performs the initial block transfer. When the transfer
   counter goes to zero, the DMA waits for the CPU to write 11₂ to the
   START or AUX START field of the DMA channel control register, and then
   autoinitializes.


6) Repeat the sequence: transfer, wait, and autoinitialize.


Note that if a DMA channel is programmed to perform n block transfers, using
autoinitialization method 2a requires n sets of DMA autoinitialization values.
Autoinitialization method 2b requires only n-1 sets because the first transfer
is set up by the initial register values. This represents some memory saving,
but successive identical DMA operations require extra CPU cycles to set the
initial DMA register values again.


Figure 11-16. DMA Channel Running in Transfer Mode 11₂ (Autoinitialization Method 2b)


   DMA is reset or halted

   CPU initializes DMA registers and DMA link pointer

   Start DMA channel

   DMA channel performs block transfers.
   Reads and writes can be synchronized.

   DMA channel waits for CPU to start it

   DMA coprocessor autoinitializes


11.9 Autoinitialization


Autoinitialization is a method for reloading a DMA channel register file when
the transfer counter goes to zero. When the DMA channel is operating in
autoinitialization mode, the link pointer register and auxiliary link pointer
register are used to initialize the registers that control the operation of the
DMA channel. These pointers are memory address locations for blocks of data
that are to be loaded into the DMA register file, shown in Figure 11-1. Link
pointers are covered in subsection 11.3.4, Link Pointer and Auxiliary
Link-Pointer Registers.


Autoinitialization is a regular DMA block transfer operation in which the des- 
tination is the DMA coprocessor’s register file. The DMA reads the value 
pointed to by the link pointer and writes it to the DMA register over the periph- 
eral bus on the next available cycle. Consequently, autoinitialization read/write 
accesses are also subject to any normal CPU/DMA access conflict. 


Autoinitialization can happen: 


- Without CPU intervention when the TRANSFER MODE bits = 10₂
  (autoinitialization 1). Refer to subsection 11.8.3, Running in TRANSFER MODE
  = 10₂ (Autoinitialization 1).


- With CPU intervention when the TRANSFER MODE bits = 11₂
  (autoinitialization 2). In this case, the CPU should restart the DMA channel
  before the autoinitialization proceeds. Refer to subsection 11.8.4, Running
  in TRANSFER MODE = 11₂ (Autoinitialization 2).


- Before any block transfer (autoinitialization method a). The DMA starts
  with the transfer counter at zero, then autoinitializes and performs a block
  transfer.


- After a block transfer (autoinitialization method b). The DMA starts with a
  regular block transfer, and, when the transfer counter register goes to
  zero, it autoinitializes.


Autoinitialization 1 or 2 can use methods a or b. 
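The four combinations differ only in where the sequence starts and in whether
the channel waits for the CPU. The following Python generator is a sketch of
that ordering; the event strings and the function name are hypothetical, and
the sketch ignores synchronization and the option of halting the channel.

```python
from itertools import islice


def channel_events(transfer_mode, method):
    """Yield the repeating event sequence of an autoinitializing channel.

    transfer_mode : 0b10 (autoinitialization 1) or 0b11 (autoinitialization 2)
    method        : "a" (transfer counter starts at zero) or
                    "b" (transfer counter starts at a nonzero value)
    """
    if transfer_mode not in (0b10, 0b11) or method not in ("a", "b"):
        raise ValueError("unsupported combination")
    if method == "a":
        yield "autoinitialize"             # counter is zero, so reload first
    while True:
        yield "block transfer"
        if transfer_mode == 0b11:
            yield "wait for CPU restart"   # halted until the CPU writes START
        yield "autoinitialize"


# First events of method 2b: transfer, wait, autoinitialize, transfer, ...
first_events_2b = list(islice(channel_events(0b11, "b"), 4))
```

Because the generator never terminates, islice is used to take a finite
prefix.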


Autoinitialization depends on the DMA channel’s current mode: split or unified 
mode. The mode of operation is controlled by the SPLIT MODE bit (bit 14 in 
Figure 11-2). When autoinitializing the DMA coprocessor, do not change the 
SPLIT MODE bit. This bit should be changed only when the DMA coprocessor 
has been reset and halted (see DMA START bit description in Table 11-5 for 
more information). 


11.9.1 Unified Mode


If the DMA channel is running in unified mode (SPLIT MODE = 0), the link
pointer is used and the DMA-channel registers are loaded in the following order:


1) DMA-channel control register
2) Source-address register
3) Source-address index register
4) Transfer-counter register
5) Destination-address register
6) Destination-address index register
7) Link-pointer register


The storage of new values for these registers in memory is illustrated in 
Figure 11-17. 


Figure 11-17. Store New Values of DMA Channel Registers in Memory (SPLIT MODE = 0)


Map of New Register Values in Memory

   Link pointer (+0) -->   DMA channel control
                  +1       Source address
                  +2       Source address index
                  +3       Transfer counter
                  +4       Destination address
                  +5       Destination address index
                  +6       Link pointer


11.9.2 Split Mode


If the DMA channel is running in split mode (SPLIT MODE = 1), then the 
autoinitialize sequence depends upon which counter has terminated. 


If the transfer counter register has gone to zero with SPLIT MODE = 1, then 
the link-pointer register is used for autoinitialization. In this case, the DMA 
channel registers are loaded in the following order: 


1) DMA-channel control register
2) Source-address register
3) Source-address index register
4) Transfer-counter register
5) Link-pointer register


The storage of the new values for these registers in memory is illustrated in
Figure 11-18.


Figure 11-18. Store New Values of DMA Channel Registers in Memory (SPLIT MODE = 1
and Transfer Counter = 0)


Map of New Register Values in Memory

   Link pointer (+0) -->   DMA channel control
                  +1       Source address
                  +2       Source address index
                  +3       Transfer counter
                  +4       Link pointer


If the auxiliary transfer counter register has gone to zero with SPLIT MODE = 1,
then the auxiliary link pointer register is used for autoinitialization. In this
case, the DMA channel registers are loaded in the following order:


1) DMA channel control register
2) Destination address register
3) Destination address index register
4) Auxiliary transfer counter register
5) Auxiliary link pointer register


The storage of the new values of these registers in memory is illustrated in
Figure 11-19.


Figure 11-19. Store New Values of DMA Channel Registers in Memory (SPLIT MODE = 1
and Auxiliary Transfer Counter = 0)


Map of New Register Values in Memory

   Auxiliary link pointer (+0) -->   DMA channel control
                            +1       Destination address
                            +2       Destination address index
                            +3       Auxiliary transfer counter
                            +4       Auxiliary link pointer
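The choice of pointer and register list in subsections 11.9.1 and 11.9.2 can be
written as a simple selection, sketched here in Python. The register names are
those used in the lists above; the function names and the dictionary used as
memory are hypothetical, and the sketch assumes the link pointer increments
(AUTOINIT STATIC = 0).

```python
UNIFIED_ORDER = [
    "DMA channel control", "source address", "source address index",
    "transfer counter", "destination address", "destination address index",
    "link pointer",
]
PRIMARY_ORDER = [
    "DMA channel control", "source address", "source address index",
    "transfer counter", "link pointer",
]
AUXILIARY_ORDER = [
    "DMA channel control", "destination address", "destination address index",
    "auxiliary transfer counter", "auxiliary link pointer",
]


def autoinit_plan(split_mode, expired):
    """Return (pointer register used, registers loaded in order).

    expired : "transfer counter" or "auxiliary transfer counter", naming the
              counter that reached zero (only relevant in split mode)
    """
    if not split_mode:
        return "link pointer", UNIFIED_ORDER
    if expired == "transfer counter":
        return "link pointer", PRIMARY_ORDER
    if expired == "auxiliary transfer counter":
        return "auxiliary link pointer", AUXILIARY_ORDER
    raise ValueError("unknown counter: " + expired)


def load(memory, pointer, order):
    """Map consecutive memory words starting at `pointer` to register names."""
    return {name: memory[pointer + i] for i, name in enumerate(order)}
```

Note that the last word of every list is the next pointer value, which is how
one block of autoinitialization values chains to the next.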


11.9.3 Incrementing the Link Pointer


During autoinitialization, the link pointer can be incremented or held constant:


- When the link pointer is incremented, the autoinitialization values are
  stored in sequential memory locations, and the link pointer or auxiliary link
  pointer is incremented in order to access each of these locations.


- When you autoinitialize the DMA channel from a stream-oriented device,
  such as the on-chip communication ports or external FIFOs, you should
  hold the link pointer constant.


This can be controlled by the AUTOINIT STATIC and the AUX AUTOINIT
STATIC bits of the DMA control register as follows:


- In unified mode, the AUTOINIT STATIC bit controls the link pointer.


- In split mode, the AUTOINIT STATIC bit controls the link pointer (primary
  channel), and the AUX AUTOINIT STATIC bit controls the auxiliary link
  pointer.


When the AUTOINIT STATIC (AUX AUTOINIT STATIC) bit is zero, the link
pointer is incremented. When it is one, the link pointer is held constant.


11.9.4 Synchronization 


Usually, autoinitialization data is stored in memory, and synchronization is not 
necessary. In some cases, you may wish to transfer autoinitialization data in 
the same way as in the synchronized data reads and writes. 


Autoinitialization synchronization is a function of the: 


- SYNC MODE bits (DMA channel control register bits 6 and 7) that control
  synchronization of data transfers, and

- AUTOINIT SYNC bits (DMA channel control register bits 10 and 11) that
  affect only autoinitialization synchronization.


If the SYNC MODE bits are not set to synchronize data transfers (that is, if
the preceding data transfer is not synchronized on interrupts), then the DMA
channel autoinitialization sequence is not synchronized either. If the SYNC
MODE bits are set to transfer data synchronously (if the preceding data
transfer is synchronized), then the upcoming DMA channel autoinitialization
sequence can be synchronized on reads or writes or both (depending on whether
the DMA coprocessor is in unified or split mode) as shown in Table 11-9. Note
that when both modes show "no synchronization" for a bit setting in the table,
the DMA channel autoinitialization sequence is not synchronized on interrupts.


In unified mode, there is no write synchronization for autoinitialization opera- 
tion, because the destination is the DMA register, which is always ready. 


In split mode, bit 6 of the DMA control register controls the autoinitialization
synchronization of the DMA primary channel, and bit 7 controls the
autoinitialization synchronization of the DMA auxiliary channel.


If primary channel autoinitialization synchronization is used, the DMA read of
autoinitialization values from memory does not proceed until the interrupt
specified in the DMAx primary write field in the DIE register is received.


If auxiliary channel autoinitialization synchronization is used, the DMA read of
autoinitialization values from memory does not proceed until the interrupt
specified in the DMAx auxiliary read field in the DIE register is received.
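
One way to read Table 11-9 is that each AUTOINIT SYNC bit only takes effect when the matching SYNC MODE bit is also set: bits 6 and 10 together for unified-mode reads or the split-mode primary channel, and bits 7 and 11 together for the split-mode auxiliary channel. The sketch below encodes that reading; the function and constant names are hypothetical, and the pairing rule is an interpretation of the table rather than a statement from the text.

```c
#include <stdint.h>

enum { AI_SYNC_NONE = 0, AI_SYNC_PRIMARY = 1, AI_SYNC_AUX = 2 };

/* Returns a bit mask of which autoinitialization reads wait for an interrupt.
 * In unified mode, AI_SYNC_PRIMARY stands for "read synchronization". */
unsigned autoinit_sync(uint32_t control, int split_mode)
{
    unsigned sync6 = (control >> 6) & 1u;    /* SYNC MODE, low bit */
    unsigned sync7 = (control >> 7) & 1u;    /* SYNC MODE, high bit */
    unsigned ai10  = (control >> 10) & 1u;   /* AUTOINIT SYNC, low bit */
    unsigned ai11  = (control >> 11) & 1u;   /* AUTOINIT SYNC, high bit */
    unsigned result = AI_SYNC_NONE;

    if (sync6 && ai10)
        result |= AI_SYNC_PRIMARY;
    if (split_mode && sync7 && ai11)
        result |= AI_SYNC_AUX;
    return result;
}
```

For example, SYNC MODE = 11 with AUTOINIT SYNC = 11 in split mode yields both channels synchronized, while SYNC MODE = 10 in unified mode never synchronizes autoinitialization, because the destination is the DMA register itself.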


Table 11-9. Effect of SYNC MODE and AUTOINIT SYNC Bits in Autoinitialization

SYNC MODE    AUTOINIT SYNC
Bits 7-6     Bits 11-10      Unified Mode          Split Mode
---------    -------------   -------------------   ------------------------------
00           00              No synchronization    No synchronization
00           01              No synchronization    No synchronization
00           10              No synchronization    No synchronization
00           11              No synchronization    No synchronization
01           00              No synchronization    No synchronization
01           01              Read                  Primary channel
01           10              No synchronization    No synchronization
01           11              Read                  Primary channel
10           00              No synchronization    No synchronization
10           01              No synchronization    No synchronization
10           10              No synchronization    Auxiliary channel
10           11              No synchronization    Auxiliary channel
11           00              No synchronization    No synchronization
11           01              Read                  Primary channel
11           10              No synchronization    Auxiliary channel
11           11              Read                  Auxiliary and primary channels


11.9.5 Effect on DMA Control Register Bits


In unified mode, all of the writable control register bits are affected by 
autoinitialization. These bits are labeled in Figure 11-20. 




In split mode during autoinitialization of the primary DMA channel, the writable,
nonauxiliary bits can be modified, but auxiliary bits are protected (as shown
in Figure 11-21). In other words, only nonauxiliary bits are allowed to be
modified by the CPU or DMA coprocessor. Also, if the auxiliary DMA channel is
autoinitialized, the writable auxiliary bits can be modified, but nonauxiliary bits
are protected. These bits are labeled in Figure 11-22.


Even though the shadowed bits (designated by s in Figure 11-20) are modified
during autoinitialization, they do not have an effect until autoinitialization is
complete. Unshadowed bits take effect immediately, affecting the
autoinitialization sequence. In other words, at autoinitialization, new
shadowed bit values are entered last, after all registers are loaded (as
specified by the link pointer).


Regardless of whether the DMA channel is running in unified mode or split 
mode, if the CPU or another external source writes to the DMA channel control 
register, this affects all writable bits, including the shadow bits. 


Note: 


If the CPU writes to the DMA control register during DMA autoinitialization, 
the CPU write takes effect after the autoinitialization sequence completes. 
Even though the autoinitialization operation on the DMA registers is not af- 
fected, the subsequent data transfer may be affected. 


Figure 11-20. DMA Channel Control Register Bits Modifiable by Autoinitialization in 
Unified Mode 


[Register bit diagram; the full field layout did not survive extraction.
Labels still legible: bits 17-15, COM PORT; bit 14, SPLIT MODE; bit 13,
WRITE BIT; bit 12, READ BIT; bit 11, AUX AUTOINIT SYNC; bit 10, AUTOINIT
SYNC; bit 9, AUX AUTOINIT STATIC; bit 8, AUTOINIT STATIC; bits 7-0, SYNC
MODE, AUX TRANSFER MODE, and TRANSFER MODE. Bits 14-0 are marked s.]

s: These shadowed bits do not take effect until autoinitialization is complete.
xx: Write protected during autoinitialization.


Figure 11-21. DMA Channel Control Register Bits Modifiable by Autoinitialization of the
Primary Channel in Split Mode


[Register bit diagram; the field layout did not survive extraction. Labels
still legible: READ BIT and AUTO INIT in the row for bits 17-10, and
AUTO INIT, SYNC, and TRANSFER in the row for bits 9-0, each marked s.]
s — These shadowed bits do not take effect until autoinitialization is complete. 
xx — Write protected during primary channel autoinitialization. 


Figure 11-22. DMA Channel Control Register Bits That Can Be Modified by 
Autoinitialization of the Auxiliary Channel in Split Mode 


[Register bit diagram; the field layout did not survive extraction. Labels
still legible: SPLIT, WRITE BIT, and AUX AUTO in the row for bits 14-10, and
AUX AUTO, SYNC, and AUX TRANSFER MODE in the low-order bits.]
s — These shadowed bits do not take effect until autoinitialization is complete. 
xx — Write protected during auxiliary channel autoinitialization. 


11.9.6 Consecutive Autoinitializations 


For many applications, it is sufficient to autoinitialize the DMA channel with the
same data each time. In this case, the new link-pointer value points to the start
of the same block of data containing the new link pointer, as illustrated in
Figure 11-23. This particular example assumes that the DMA channel is not
running in split mode.


If you want, you can make the new link pointer point to a new set of register
values, as illustrated in Figure 11-24. This can be continued to any level.
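
The two arrangements can be modeled on a host with a plain array standing in for word-addressed memory. This is a sketch only: the type and function names are hypothetical, and the register order is the one listed in Figure 11-23 for unified mode.

```c
#include <stdint.h>

#define AUTOINIT_WORDS 7   /* register values per block in unified mode */

/* Register set in the order listed in Figure 11-23. */
typedef struct {
    uint32_t control, src, src_index, count, dst, dst_index, link;
} dma_regs_t;

/* Load one block from simulated word-addressed memory at regs->link.
 * The last word loaded becomes the link pointer used by the next
 * autoinitialization. */
void autoinit_load(const uint32_t *mem, dma_regs_t *regs)
{
    const uint32_t *p = &mem[regs->link];
    regs->control   = p[0];
    regs->src       = p[1];
    regs->src_index = p[2];
    regs->count     = p[3];
    regs->dst       = p[4];
    regs->dst_index = p[5];
    regs->link      = p[6];
}
```

A block whose seventh word holds its own start address reloads the same values on every autoinitialization (Figure 11-23); a block whose seventh word holds the address of another block produces a chain (Figure 11-24).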


Figure 11-23. Self-Referential Link Pointer 
Map of New Register Values in Memory 


Link pointer > DMA channel control reg. 


Source address 


Source address index 


Transfer counter 


Destination address 


Destination address index 


Link pointer 


Figure 11-24. Referring to a New Link Pointer 
Map of New Register Values in Memory 


Link pointer -————» DMA channel control 


+1 Source address 


+2 Source address index 


+3 Transfer counter 


+4 Destination address 


+5 Destination address index 


+6 Link pointer 


DMA channel control reg. 


Source address 


Source address index 


Transfer counter 


Destination address 


Destination address index 


Link pointer 



11.10 DMA and Interrupts 



The DMA coprocessor uses interrupts in the following way: 


- It can send interrupts to the CPU when a block transfer finishes. See the
  TCC and AUX TCC bits in Figure 11-2.

- It can receive interrupts from the external interrupt pins (IIOF3-0), the
  timers, or the communication ports (ICRDY, OCRDY).


This section explains how the DMA receives interrupts. This process is called 
synchronization. 


All of the interrupts that the DMA coprocessor can see are first received by the 
CPU interrupt controller. Edge-triggered interrupts are latched by the CPU in 
the appropriate interrupt flag register; level-triggered interrupts are not. 


When an external interrupt (IIOF3-0) is used for DMA coprocessor transfer
synchronization, the CPU is responsible for configuring external interrupts as
edge- or level-triggered interrupts (as set in the FUNCx and TYPEx bits of the
interrupt flag register, discussed in subsection 3.1.10, IIOF Flag Register
(IIF), on page 3-13).


Edge-triggered interrupts are timer interrupts, DMA interrupts, and external
interrupts that are configured as edge-triggered interrupts. Detailed information
on interrupts is provided in Section 7.4, Interrupts, on page 7-15, and Section
7.6, DMA Interrupts, on page 7-26. When the interrupt controller determines
that an edge-triggered interrupt that a DMA channel is waiting on (DIE register
bits set) has been latched into the interrupt flag, the CPU clears the
interrupt flag and sends an interrupt pulse to the DMA channel. The DMA channel
latches the interrupt locally until it can service the interrupt. At that time, the
latched interrupt is cleared by the DMA coprocessor for two cycles.


Level-triggered interrupts generated by communication ports and external in- 
terrupts that are configured as level-triggered interrupts are handled differently 
by the CPU interrupt controller. When the interrupt controller determines that 
a level-triggered interrupt that a DMA channel is waiting for (DIE register bits 
set) has been received, the CPU sends an interrupt pulse to the DMA channel. 
The DMA channel latches the interrupt locally until it can service the interrupt. 
At that time, the locally latched interrupt is cleared by the DMA coprocessor 
for two cycles. 


The interrupt reset signal generated by the DMA coprocessor after a DMA
interrupt is serviced has priority over the interrupt set signal. Thus, the interrupt
signal will not be continuously set, even if the CPU is continuously sending the
interrupt set signal. Therefore, when the DMA-set priority scheme is used and
a higher priority DMA channel is driven by continuous interrupt signals, the
lower priority DMA channel can be serviced in between the higher priority DMA
services.


Unlike the ’C3x, the ’C4x DMA coprocessor is not affected by processing the
CPU interrupts, even when pipeline fetches are being halted. When interrupts
are enabled in the DIE register, the interrupt is latched automatically by the
CPU interrupt controller and saved for future DMA use. When a flag interrupt
(timer, external interrupt) is latched, the IIF flag is cleared. Note that IIF flags
are cleared when the CPU interrupt controller latches the interrupt, not when
the DMA responds to it. Even if the DMA has not been started, the interrupt
latch occurs, except when the start bits in the DMA control register have the
reset value (00₂ in the START or AUX START bits). DMA reset clears the
internal interrupt latch. To avoid losing previously received interrupts, it is
recommended that you initialize the DIE register after starting the DMA, when
the DMA start bits have the value 11₂. Note that when the DMA completes a
transfer, the START (AUX START) bits are set to 10₂. For this reason, the DMA
will not miss any interrupt between transfers.


The DMA and the CPU can respond to the same interrupt if the CPU is not in- 
volved in any pipeline conflict or in any instruction that halts instruction fetch- 
ing. Refer to subsection 7.4.1, Interrupt Vector Table and Prioritization, on 
page 7-15 for more details. It is also possible for different DMA channels (in- 
cluding auxiliary and primary channels) to respond to the same interrupt. If the 
same interrupt is selected for source and destination synchronization, both 
read and write cycles are enabled with a single incoming interrupt. 


The internal circuitry of the ’C4x guarantees proper operation between a com- 
munication port that generates level-triggered interrupts and the DMA channel 
that is synchronizing with those level-triggered interrupts. 


Note: 


When you synchronize the DMA channels with external interrupts, it is better
to configure the interrupt lines as edge-triggered interrupts to ensure that
only one interrupt is recognized.


11.10.1 Interrupts and Synchronization of DMA Channels


You can use interrupts to synchronize DMA channel transfers. Setting up the
DMA for a synchronous data transfer mode requires two steps:


1) Set the DMA SYNC MODE bits (bits 6 and 7) in the DMA channel control
   register to the value for the source, destination, or source and
   destination synchronization desired. See subsection 11.10.2,
   Synchronization Mode Bits, for more information.


2) Set the DIE register to enable the corresponding interrupt for the DMA
   transfer synchronization desired. Figure 11-25 and Figure 11-26 show
   the DIE register for the unified and split modes, respectively. Table 11-10
   and Table 11-11 list the different synchronization interrupts for unified
   mode, and Table 11-12 and Table 11-13 list them for split mode.


   It is recommended that you initialize the DIE register after starting the
   DMA, when the start bits have the value 11₂. This prevents losing
   previously received interrupts, which may occur if you enable the DIE
   register when the start bits are 00₂ (reset value).


Figure 11-25. DIE Register Bit Functions for DMA Unified Mode

Bits     Field         Access
-----    ----------    ------
31-29    DMA5 Write    R/W
28-26    DMA5 Read     R/W
25-23    DMA4 Write    R/W
22-20    DMA4 Read     R/W
19-17    DMA3 Write    R/W
16-14    DMA3 Read     R/W
13-11    DMA2 Write    R/W
10-8     DMA2 Read     R/W
7-6      DMA1 Write    R/W
5-4      DMA1 Read     R/W
3-2      DMA0 Write    R/W
1-0      DMA0 Read     R/W

R = Read, W = Write


Table 11-10. DMA Channels 0 and 1 (DMA0 and DMA1) Unified-Mode Synchronization
Interrupts

Bit Value         Interrupt Enabled at DMA0 or DMA1
(in DMA0     DMA0     DMA0     DMA1     DMA1     Interrupt Source for
or DMA1)     Read     Write    Read     Write    DMA Synchronization
---------    ------   ------   ------   ------   ------------------------------
00†          None     None     None     None     --
01           ICRDY0   OCRDY0   ICRDY1   OCRDY1   From communication port
10           IIOF0    IIOF1    IIOF2    IIOF3    From external pins IIOF0-IIOF3
11           TIM0     TIM0     TIM0     TIM0     From timer TIM0

† DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.


Table 11-11. DMA Channels 2 to 5 (DMA2 to DMA5) Unified-Mode Synchronization
Interrupts

Bit Value             Interrupt Enabled at DMA2-DMA5†
(in DMA2 to DMA5)     DMAx Read    DMAx Write    Interrupt Source for DMA Synchronization
-----------------     ---------    ----------    ----------------------------------------
000‡                  None         None          --
001                   ICRDYx       OCRDYx        From communication port
010                   IIOF0        IIOF0         From external pins IIOF0-IIOF3
011                   IIOF1        IIOF1         From external pins IIOF0-IIOF3
100                   IIOF2        IIOF2         From external pins IIOF0-IIOF3
101                   IIOF3        IIOF3         From external pins IIOF0-IIOF3
110                   TIM0         TIM0          From timers TIM0 and TIM1
111                   TIM1         TIM1          From timers TIM0 and TIM1

† The x in DMAx represents the DMA channel number and also the number for the
  corresponding ICRDYx and OCRDYx interrupts. For example, a 001₂ in both DMA2
  READ and DMA5 WRITE would enable interrupts ICRDY2 and OCRDY5, respectively.
  All other viable bit values (010₂ to 111₂) are the same (as shown in the
  table) for DMA2 through DMA5.

‡ DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.
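
As an illustration of how the DIE fields pack together, the following sketch builds a DIE value on a host. The helper names are hypothetical. The field positions are inferred from the bit numbering in Figure 11-25 combined with the 2-bit values of Table 11-10 and the 3-bit values of Table 11-11; they are not taken from a register definition in the text. In split mode the same positions would apply, with "read" meaning auxiliary read and "write" meaning primary write.

```c
#include <stdint.h>

/* Shift and width of the DIE field for DMA channel ch (0-5).
 * is_write = 0 selects the read field, 1 selects the write field. */
static void die_field(int ch, int is_write, unsigned *shift, unsigned *width)
{
    if (ch < 2) {                 /* DMA0, DMA1: 2-bit fields in bits 7-0 */
        *width = 2;
        *shift = (unsigned)(ch * 4 + (is_write ? 2 : 0));
    } else {                      /* DMA2-DMA5: 3-bit fields in bits 31-8 */
        *width = 3;
        *shift = (unsigned)(8 + (ch - 2) * 6 + (is_write ? 3 : 0));
    }
}

/* Return die with the selected field replaced by value. */
uint32_t die_set(uint32_t die, int ch, int is_write, uint32_t value)
{
    unsigned shift, width;
    die_field(ch, is_write, &shift, &width);
    uint32_t mask = ((1u << width) - 1u) << shift;
    return (die & ~mask) | ((value << shift) & mask);
}
```

For instance, the footnote's example (001 in both DMA2 READ and DMA5 WRITE) corresponds to `die_set(die_set(0, 2, 0, 1), 5, 1, 1)`, which sets bit 8 and bit 29.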


Figure 11-26. DIE Register Bit Functions for DMA Split Mode

Bits     Field                   Access
-----    --------------------    ------
31-29    DMA5 Primary Write      R/W
28-26    DMA5 Auxiliary Read     R/W
25-23    DMA4 Primary Write      R/W
22-20    DMA4 Auxiliary Read     R/W
19-17    DMA3 Primary Write      R/W
16-14    DMA3 Auxiliary Read     R/W
13-11    DMA2 Primary Write      R/W
10-8     DMA2 Auxiliary Read     R/W
7-6      DMA1 Primary Write      R/W
5-4      DMA1 Auxiliary Read     R/W
3-2      DMA0 Primary Write      R/W
1-0      DMA0 Auxiliary Read     R/W


R = Read, W = Write


Table 11-12. DMA Channels 0 and 1 (DMA0 and DMA1) Split-Mode Synchronization
Interrupts

Bit Value            Interrupt Enabled at DMA0 or DMA1
(in DMA0     DMA0        DMA0       DMA1        DMA1
or DMA1)     Auxiliary   Primary    Auxiliary   Primary    Interrupt Source for
             Read        Write      Read        Write      DMA Synchronization
---------    ---------   -------    ---------   -------    ------------------------------
00†          None        None       None        None       --
01           ICRDY0      OCRDY0     ICRDY1      OCRDY1     From communication port
10           IIOF0       IIOF1      IIOF2       IIOF3      From external pins IIOF0-IIOF3
11           TIM0        TIM0       TIM0        TIM0       From timer TIM0

† DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.


Table 11-13. DMA Channels 2 to 5 (DMA2 to DMA5) Split-Mode Synchronization
Interrupts

Bit Value             Interrupt Enabled at DMA2-DMA5†
(in DMA2 to DMA5)     DMAx Auxiliary Read    DMAx Primary Write    Interrupt Source for DMA Synchronization
-----------------     -------------------    ------------------    ----------------------------------------
000‡                  None                   None                  --
001                   ICRDYx                 OCRDYx                From communication port
010                   IIOF0                  IIOF0                 From external pins IIOF0-IIOF3
011                   IIOF1                  IIOF1                 From external pins IIOF0-IIOF3
100                   IIOF2                  IIOF2                 From external pins IIOF0-IIOF3
101                   IIOF3                  IIOF3                 From external pins IIOF0-IIOF3
110                   TIM0                   TIM0                  From timers TIM0 and TIM1
111                   TIM1                   TIM1                  From timers TIM0 and TIM1

† The x in DMAx represents the DMA channel number and also the number for the
  corresponding ICRDYx and OCRDYx interrupts. For example, a 001₂ in both DMA2
  READ and DMA5 WRITE would enable interrupts ICRDY2 and OCRDY5, respectively.
  All other viable bit values (010₂ to 111₂) are the same (as shown in the
  table) for DMA2 through DMA5.

‡ DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.


11.10.2 Synchronization Mode Bits 


Table 11-3 and Table 11-4 describe how the bit values of the SYNC MODE
field of the DMA channel control register determine synchronization in unified
and split mode, respectively:

- No synchronization (SYNC MODE = 00₂)
- Source synchronization
  - for unified mode (SYNC MODE = 01₂)
  - for split mode (SYNC MODE = 10₂)
- Destination synchronization
  - for unified mode (SYNC MODE = 10₂)
  - for split mode (SYNC MODE = 01₂)
- Source and destination synchronization (SYNC MODE = 11₂)


When the ’C4x DMA is in split mode, the primary channel supports write (or
destination) synchronization transfers only, and the auxiliary channel supports
read (or source) synchronization transfers only. In split mode, bits 6 and 7 of
the DMA channel control register (as shown in Table 11-3) are used to control
channel synchronization:

- Bit 6 controls primary write channel synchronization (destination
  synchronization).

- Bit 7 controls auxiliary read channel synchronization (source
  synchronization).


DMA transfer rate in synchronization mode is explained in subsection 11.11.2, 
DMA Transfer Rate in Synchronization Mode, on page 11-55. 
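
The SYNC MODE encodings listed above differ between unified and split mode, which is easy to get backwards. A minimal decoding sketch follows; the type and function names are hypothetical, and only the bit meanings come from the text.

```c
#include <stdint.h>

typedef struct {
    int sync_read;    /* reads (source) wait for an interrupt */
    int sync_write;   /* writes (destination) wait for an interrupt */
} dma_sync_t;

/* Decode SYNC MODE (control register bits 7-6).
 * Unified mode: 01 = source (read), 10 = destination (write), 11 = both.
 * Split mode:   bit 6 = primary channel write, bit 7 = auxiliary channel read. */
dma_sync_t decode_sync_mode(uint32_t control, int split_mode)
{
    unsigned bit6 = (control >> 6) & 1u;
    unsigned bit7 = (control >> 7) & 1u;
    dma_sync_t s;

    if (split_mode) {
        s.sync_write = (int)bit6;
        s.sync_read  = (int)bit7;
    } else {
        s.sync_read  = (int)bit6;
        s.sync_write = (int)bit7;
    }
    return s;
}
```

The same value, binary 01, therefore means read synchronization in unified mode but write synchronization (primary channel) in split mode.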


No Synchronization 


When SYNC MODE = 00₂, no synchronization is performed. The DMA performs
reads and writes whenever it has the priority to use the DMA bus. All
interrupts are ignored. Note the difference between this mode and having the
zero value in the DIE read or write fields. Having zeros in the DIE register
read/write fields results in a total DMA halt if synchronization is used,
whereas SYNC MODE = 00₂ leaves the DMA channel running freely. Figure 11-27
shows the mechanism used when SYNC MODE = 00₂.


Figure 11-27. No DMA Synchronization

(SYNC MODE = 00₂)


Source Synchronization


When SYNC MODE = 01₂ (for unified mode) or when SYNC MODE = 10₂ (for the
auxiliary channel in split mode), the DMA coprocessor is synchronized to the
source (see Figure 11-28). A read will not be performed until an interrupt is
received by the DMA channel. Then, all DMA interrupts are disabled globally.
However, no bits in the DMA interrupt enable register are changed.


Figure 11-28. DMA Source Synchronization

a) DMA Channel in Unified Mode, Sync on Read (SYNC MODE = 01₂)
b) Auxiliary Channel in Split Mode (SYNC MODE = 10₂)


Destination Synchronization 


When SYNC MODE = 10₂ (for unified mode) or when SYNC MODE = 01₂ (for the
primary channel in split mode), the DMA channel is synchronized to the
destination. A write is not performed until an interrupt is received by the DMA
channel. Figure 11-29 shows the synchronization mechanism.


In unified mode, the read is performed without waiting for the interrupt.
However, in split mode, the read occurs only when the interrupt enabling the
write is received. This avoids a lock situation that could happen if the primary
channel reads but never writes out of the temporary register, because it does
not receive the write interrupt. In this case, the auxiliary channel could not
proceed, because the DMA internal temporary register is busy.


Figure 11-29. DMA Destination Synchronization

a) DMA Channel in Unified Mode, Sync on Write (SYNC MODE = 10₂)

   Start
     -> DMA channel performs a read
     -> Idle until enabled interrupt is received
     -> Disable DMA interrupts globally
     -> DMA channel performs a write
     -> Enable DMA interrupts globally
     -> Go to start

b) Primary Channel in Split Mode (SYNC MODE = 01₂)

   Start
     -> Idle until enabled interrupt is received
     -> Disable interrupts globally
     -> DMA channel performs a read
     -> Write data to communication port output FIFO
     -> Enable DMA interrupts globally
     -> Go to start


Source and Destination Synchronization 


When SYNC MODE = 11₂, a read is performed when a read interrupt is
received, and a write is performed on the write interrupt. If a write interrupt is
received before a read interrupt, the write interrupt is latched, and the DMA data
write is not executed until the read is completed. Unified mode source and
destination synchronization (SYNC MODE = 11₂) is shown in Figure 11-30.


If DMA split mode is selected, it reacts as two independent synchronizations
for the primary (write synchronization) and auxiliary (read synchronization)
channels. Figure 11-28b and Figure 11-29b show this.
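
The unified-mode behavior described above, including the latched early write interrupt, can be expressed as a small state machine. The following is a host-side model only, with hypothetical names; it ignores the global disabling and enabling of DMA interrupts and all timing.

```c
#include <stdbool.h>

typedef struct {
    bool read_irq;    /* latched read (source) interrupt */
    bool write_irq;   /* latched write (destination) interrupt */
    bool holding;     /* a word has been read and awaits its write */
} sync_chan_t;

typedef enum { ACT_IDLE, ACT_READ, ACT_WRITE } sync_action_t;

/* Advance the channel by one step and report what it did. */
sync_action_t sync_step(sync_chan_t *c)
{
    if (!c->holding) {
        if (!c->read_irq)
            return ACT_IDLE;      /* a pending write_irq stays latched */
        c->read_irq = false;
        c->holding = true;
        return ACT_READ;
    }
    if (!c->write_irq)
        return ACT_IDLE;
    c->write_irq = false;
    c->holding = false;
    return ACT_WRITE;
}
```

If the write interrupt arrives first, the model idles with `write_irq` still set; once the read interrupt arrives, the read and then the write proceed on successive steps.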


When the same interrupt is selected for read and write synchronization (in
either split or unified mode), one single interrupt will enable both read and
write operations.


Figure 11-30. Unified Mode DMA Source and Destination Synchronization

(SYNC MODE = 11₂)

   Start
     -> Idle until enabled interrupt is received
     -> Disable DMA interrupts globally
     -> DMA channel performs a read
     -> Enable DMA interrupts globally
     -> Idle until enabled interrupt is received
     -> Disable DMA interrupts globally
     -> DMA channel performs a write
     -> Enable DMA interrupts globally
     -> Go to start


11.11 DMA Memory Transfer Timing


The ’C4x provides six DMA channels (twelve DMA channels if they are all in 
split mode) with a fixed/rotating priority arbitration scheme and configurable 
CPU/DMA priority scheme (for detailed information, see Section 11.6 for DMA 
internal priority schemes and Section 11.7, CPU and DMA Coprocessor Ar- 
bitration, for CPU and DMA priority arbitration). 


The maximum data transfer rate that the ’C4x DMA sustains is one word every 
two cycles. The six DMA channels transfer data in a sequential time-slice fash- 
ion, rather than simultaneously, because they share common buses. 


DMA memory transfer timing can be very complicated, especially if bus re- 
source conflicts occur. However, some rules help you calculate the transfer 
timing for certain DMA setups. For simplification, the following subsection fo- 
cuses on a single-channel DMA memory transfer timing with no conflict with 
the CPU or other DMA channels. You can obtain the actual DMA transfer tim- 
ing by combining the calculations for single-channel DMA transfer timing with 
those for bus resource conflict situations. 


11.11.1 Single DMA Memory Transfer Timing 


When the DMA memory transfer has no conflict with the CPU or any other
DMA channels, the number of cycles of a DMA transfer depends on whether
the source and destination locations are designated as on-chip memory,
peripheral, or external ports. When the external port is used, the DMA transfer
speed is affected by two factors: the external bus wait state and the read/write
conflict (for example, if a write is followed by a read, the read takes two cycles).
Figure 11-31 through Figure 11-33 show the number of cycles a DMA transfer
requires from different sources to different destinations. Entries in each figure
represent the number of cycles required to do the T transfers, assuming that
there are no pipeline conflicts. A timing diagram for the DMA transfers
accompanies each figure.




Figure 11-31. Timing and Number of Cycles for DMA Transfers to On-Chip Destination

Legend:
T   = Number of transfers
Cr  = Source-read wait states
R   = Single-cycle reads
W   = Single-cycle writes
RR  = Multicycle reads


Figure 11-32. Timing and Number of Cycles for DMA Transfers to a Local-Bus Destination

Legend:
T   = Number of transfers
Cr  = Source-read wait states
Cw  = Destination-write wait states
R   = Single-cycle reads
RR  = Multicycle reads
WW  = Multicycle writes


Figure 11-33. Timing and Number of Cycles for DMA Transfers to a Global-Bus Destination

Legend:
T   = Number of transfers
Cr  = Source-read wait states
Cw  = Destination-write wait states
R   = Single-cycle reads
RR  = Multicycle reads
WW  = Multicycle writes


Externally, on the global and local buses, writes take at least two cycles. How- 
ever, internally, the CPU/DMA requires one cycle to perform the write to exter- 
nal memory. Therefore, the DMA/CPU can transfer data on the next cycle if it 
is not to the same external bus. For example, the DMA transfers 1024 words 
from internal memory RAM block 1 to a 1-wait-state memory on the global bus 
while the CPU runs from memory on the local bus and fetches operands from 
RAM block 0. The DMA transfer time is calculated from Figure 11-32 as 
1 + (2+1)1024 = 1 + 3072 = 3073 cycles. 
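
The worked example has the form 1 + (2 + Cw) × T, with Cw = 1 wait state and T = 1024 transfers. A small helper makes the arithmetic easy to repeat for other values. The function name is hypothetical, and generalizing the example to arbitrary Cw and T is an inference from this one calculation, not a formula quoted from the figures.

```c
/* Cycles for T transfers from on-chip memory to external memory with
 * write_wait_states wait states, following the form of the worked example:
 * 1 + (2 + Cw) * T. The generalization is inferred from that example. */
unsigned long dma_cycles_onchip_to_external(unsigned long transfers,
                                            unsigned long write_wait_states)
{
    return 1UL + (2UL + write_wait_states) * transfers;
}
```

Calling `dma_cycles_onchip_to_external(1024, 1)` reproduces the 3073 cycles computed in the text.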
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11.11.2 DMA Transfer Rate in Synchronization Mode 


The synchronization mode used for transfers also affects the DMA data trans- 
fer rate. The DMA data transfer rate is slower if synchronization is used be- 
cause it takes two cycles to reset the request from the interrupt. However, 
these two extra cycles can be absorbed if multiple DMAs are running at the 
same time. 


In unified mode, the maximum transfer rate is one word every three cycles, us- 
ing synchronization. Figure 11-34 shows the number of cycles a DMA transfer 
requires under unified mode with different types of synchronization. For simpli- 
fication, a single-channel DMA memory transfer timing with no conflict with 
CPU or other DMA channels, no memory wait states, and interrupts always 
active, is considered. 


Figure 11-34. Unified-Mode DMA Timing for Different Synchronizations

Legend:
T   = Number of transfers
R   = Single-cycle reads
W   = Single-cycle writes
Rr  = Read-flag reset (2 cycles)
Wr  = Write-flag reset (2 cycles)


In split mode, the maximum transfer rate for either the primary or auxiliary
channel is one word every four cycles, using synchronization. When auxiliary
and primary channels are running at the same time, the two-cycle overhead
for interrupt reset is absorbed, and the maximum transfer rate can be one word
every two cycles. Figure 11-35 shows the number of cycles a DMA transfer
requires in split mode with different types of synchronization. For
simplification, a single-channel DMA memory transfer timing with no conflict
with CPU or other DMA channels, no wait states, and interrupts always active,
is considered.
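
To put the stated rates side by side, the sketch below converts cycles per word into a best-case word rate. The names are hypothetical; the cycle counts are the ones quoted in Sections 11.11 and 11.11.2, and the cycle time is a parameter you supply (40 ns is used below purely as an example value).

```c
/* Cycles per word, as stated in the text. */
enum {
    CYCLES_PER_WORD_UNSYNC       = 2,  /* maximum sustained rate */
    CYCLES_PER_WORD_UNIFIED_SYNC = 3,  /* unified mode with synchronization */
    CYCLES_PER_WORD_SPLIT_SYNC   = 4,  /* primary or auxiliary running alone */
    CYCLES_PER_WORD_SPLIT_BOTH   = 2   /* both running: reset overhead absorbed */
};

/* Best-case words per second for a given cycle time in nanoseconds. */
double peak_words_per_second(double cycle_ns, int cycles_per_word)
{
    return 1e9 / (cycle_ns * (double)cycles_per_word);
}
```

At a 40-ns cycle, two cycles per word gives 12.5 million words per second, and four cycles per word gives half of that.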


Figure 11-35. Split-Mode DMA Timing for Different Synchronizations

Legend:
T   = Number of transfers
R   = Single-cycle reads, primary channel
W   = Single-cycle writes, primary channel
W'  = Single-cycle writes, auxiliary channel
Ar  = Auxiliary channel flag reset (2 cycles)


Communication Ports


The ’C4x offers six (’C40) or four (’C44) on-chip communication ports for
interfacing with other ’C4xs and peripherals. One important feature of the
ports is that they can work with the DMA coprocessor to transfer data without
CPU intervention, allowing the CPU to perform other tasks.


This chapter describes the key features, memory map and registers, and
operations of the communication ports of the ’C4x digital signal processor.


12.1 Features


Each ’C4x communication port has several key features:

- 160-MB per second bidirectional peak data transfer rates (at 40-ns cycle
  time)

- Simple processor-to-processor communication via eight data lines and
  four control lines

- FIFO buffering of all data transfers

- Automatic arbitration and handshaking to ensure communication
  synchronization

- Synchronization between the CPU or direct-memory access (DMA)
  coprocessor and the six communication ports via internal interrupts and
  internal ready signals

- Support of a wide variety of multiprocessor architectures, including rings,
  trees, hypercubes, bidirectional pipelines, two-dimensional Euclidean
  grids, hexagonal grids, and three-dimensional grids

- Communication-port software reset (’C40 revisions > 5.0 and ’C44 only)



12.2 Operational Overview 


The ’C4x contains six (’C40) or four (’C44) identical high-speed communication
ports, each of which provides a bidirectional communication interface to
one other ’C4x or external peripheral. Figure 12-1 shows the internal
architecture of a single communication port. Each port contains the following
components:

- Input FIFO channel: provides an 8-level, 32-bit wide first-in-first-out
  (FIFO) input buffer that isolates the ’C4x from the port communication data
  bus and buffers data received from an external device via the bus.

- Output FIFO channel: provides an 8-level, 32-bit wide FIFO output
  buffer that isolates the ’C4x from the port communication data bus and
  buffers data to be sent to an external device via the bus.

- Port arbitration unit (PAU): handles the arbitration tasks associated
  with the movement of data between a ’C4x and an external device via the
  port communication data bus. The PAU is described in detail in Section
  12.4, Port Arbitration Units (PAUs), on page 12-11.

- Communication port control register (CPCR): allows you to control
  the communication port functions and data transfer operations between
  a ’C4x and an external device via the communication port data bus.

- Communication-port software reset register (’C44 and ’C40 rev > 5.0):
  allows you to flush the input FIFO and output FIFO levels of a
  communication port. This is explained in subsection 12.3.4, Communication
  Port Software Reset Register, on page 12-10.


A communication port transmits each of the 32-bit words stored in its output
FIFO on a byte-by-byte basis. Because the control and data lines are
bidirectional, each ’C4x must have ownership of the communication port data bus
before starting a word transfer. A simulated token is used to designate bus
ownership: the communication port that has the token owns the communication
port data bus and can transmit data.


Figure 12-1. Communication Port Block Diagram


Figure 12-2. ’C4x Communication-Port Interface-Connection Example

   ’C4x Processor                ’C4x Processor

   CREQ1     <------------>     CREQ4
   CACK1     <------------>     CACK4
   CSTRB1    <------------>     CSTRB4
   CRDY1     <------------>     CRDY4
   C1D(7-0)  <------------>     C4D(7-0)

Figure 12-2 is an example of two ’C4x DSPs connected via their
communication ports. This simple communication interface consists of the
following bidirectional control and data lines:


- CREQx: communication-port token request. A ’C4x activates this signal
  to request the use of the communication-port data bus.

- CACKx: communication-port token acknowledge. A ’C4x activates this
  signal to relinquish ownership of the communication-port data bus upon
  receiving a CREQx from another ’C4x.

- CSTRBx: communication-port strobe. A sending ’C4x activates this
  signal to indicate that it has placed a valid data byte on the
  communication port data bus.

- CRDYx: communication-port ready. A receiving ’C4x activates this
  signal to indicate that it has received a data byte via the communication
  port data bus.

- CxD(7-0): communication-port data bus. This bus carries data
  bidirectionally, one byte at a time, between two ’C4xs or between a ’C4x
  and some other device.


12.2.1 Token Transfer Operation 


To transfer a token, the PAUs in the two ’C4xs cooperate to generate the sig- 
nals and control sequences necessary to ensure orderly data transfers at the 
highest possible rate. To avoid conflicts on the bus, the PAUs arbitrate bus 
ownership, allowing only one DSP to transmit at any given time. The PAU that 
owns the token can relinquish bus ownership when the other 'C4x has data to 
send.


The signals CREQx and CACKx handle the handshaking arbitration between
the two DSPs in two steps:


1) The PAU that does not own the data bus (CxD(7-0)) activates CREQx to 
request bus ownership. 


2) The PAU owning the bus then activates CACKx to acknowledge the re- 
quest and relinquish bus ownership to the requesting PAU. 


In this manner, these signals transfer a token (or priority) from one PAU to an- 
other, and the PAU receiving the token gains ownership of the bus. See Sec- 
tion 12.7, Token Transfer Operation, for a detailed description of token trans- 
fer. 


12.2.2 Data Transfer Operation


A data transfer operation takes four basic steps to complete:


1) The CPU or DMA coprocessor of the sending DSP writes a 32-bit data 
word to the output FIFO (of a communication port) via a memory-mapped 
address (listed in Figure 12-3). 


2) The communication port then places the 32-bit data word on CxD(7-0)
one byte at a time (LS byte first), activating CSTRBx to signal the
receiving communication port that the bus contains a valid data byte.


3) Upon receiving each data byte, the receiving communication port acti- 
vates CRDYx to indicate that it has received the data byte. 


4) After receiving the 4 bytes of a 32-bit word, the CPU or DMA coprocessor 
of the receiving DSP can then read the data from the input FIFO via a 
memory-mapped address (listed in Figure 12-3). 


Each of the input and output FIFOs can buffer a maximum of eight 32-bit 
words. 


Buffering provided by the input and output FIFOs is essential. This buffering 
allows for a high degree of decoupling of computation and communication 
overhead. When ’C4xs A and B are connected via their communication ports, 
the effective length of the FIFOs becomes 16 levels. This occurs because the 
output path from A to B is the concatenation of the eight levels of the output 
FIFO of A with the eight levels of the input FIFO of B. This also applies for the 
output path from B to A. 
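

The byte ordering in step 2 and the FIFO concatenation described above can be illustrated with a small host-side model. This is only a sketch in Python, not ’C4x code; the function names and the FIFO_DEPTH constant are hypothetical and chosen for this illustration.

```python
FIFO_DEPTH = 8  # levels in each input or output FIFO, per this section


def word_to_bytes(word):
    """Split a 32-bit word into four bytes, least significant byte first."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("word must fit in 32 bits")
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def bytes_to_word(data):
    """Rebuild a 32-bit word from four bytes received LS byte first."""
    if len(data) != 4:
        raise ValueError("a word is exactly four bytes")
    word = 0
    for i, byte in enumerate(data):
        word |= (byte & 0xFF) << (8 * i)
    return word


def effective_path_depth():
    """Output FIFO of the sender concatenated with input FIFO of the receiver."""
    return FIFO_DEPTH + FIFO_DEPTH
```

For a word such as 12345678h, the model places 78h on the bus first and 12h last, and the receiving side reassembles the same word from the four bytes.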

12.3 Memory Map and Registers


Figure 12-3 shows the memory map for the ’C4x communication-port control
registers (CPCRs) and their associated input FIFOs and output FIFOs. The
lowest three addresses of each port’s 16-address block are mapped to the
corresponding CPCR and its associated input and output FIFOs. Fields (bits)
within a CPCR are shown in Figure 12-4.


Figure 12-3. Communication-Port Memory Map 


Address       Register
0010 0040h    CPCR 0 (’C40 only)
0010 0041h    Input port 0, FIFO position 0
0010 0042h    Output port 0, FIFO position 7
0010 0043h    Port 0 software reset†
0010 0050h    CPCR 1
0010 0051h    Input port 1, FIFO position 0
0010 0052h    Output port 1, FIFO position 7
0010 0053h    Port 1 software reset†
0010 0060h    CPCR 2
0010 0061h    Input port 2, FIFO position 0
0010 0062h    Output port 2, FIFO position 7
0010 0063h    Port 2 software reset†
0010 0070h    CPCR 3 (’C40 only)
0010 0071h    Input port 3, FIFO position 0
0010 0072h    Output port 3, FIFO position 7
0010 0073h    Port 3 software reset†
0010 0080h    CPCR 4
0010 0081h    Input port 4, FIFO position 0
0010 0082h    Output port 4, FIFO position 7
0010 0083h    Port 4 software reset†
0010 0090h    CPCR 5
0010 0091h    Input port 5, FIFO position 0
0010 0092h    Output port 5, FIFO position 7
0010 0093h    Port 5 software reset†

† This feature is available only on the ’C44 and on the ’C40 (revision 5.0 and above).
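

The regular spacing of the map in Figure 12-3 means that any register address can be computed from the port number. The following host-side Python sketch shows the arithmetic; the names PORT_BASE, PORT_STRIDE, OFFSETS, and port_register_address are hypothetical and exist only for this illustration.

```python
PORT_BASE = 0x00100040    # CPCR of port 0 (Figure 12-3)
PORT_STRIDE = 0x10        # each port occupies a 16-address block

# Offsets within a port's block, taken from Figure 12-3.
OFFSETS = {"cpcr": 0, "input": 1, "output": 2, "reset": 3}


def port_register_address(port, register):
    """Return the memory-mapped address of a register of port 0 through 5."""
    if port not in range(6):
        raise ValueError("port must be 0 through 5")
    return PORT_BASE + PORT_STRIDE * port + OFFSETS[register]
```

For example, port_register_address(1, "reset") evaluates to 0010 0053h, which matches the port 1 software reset entry in the map.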

12.3.1 Communication-Port Control Register (CPCR)


Figure 12-4 shows the format of a ’C4x CPCR, which contains control and
status bits for its associated communication port. The text following the
figure lists the CPCR bits and fields and describes their functions.


Figure 12-4. Communication-Port Control Register (CPCR) 


Bits     Field           Access
31-13    xx (reserved)
12-9     INPUT LEVEL     R
8-5      OUTPUT LEVEL    R
4        OCH             R/W
3        ICH             R/W
2        PORT DIR        R
1-0      xx (reserved)

Notes: 1) xx = reserved bit (read/write as zero).
       2) R = read, W = write.


Reserved Undefined 


PORT DIR Port Direction. This bit determines the direction of data transfer operations 
for the communication port. 


PORT DIR = 0: port is in the output mode. 
PORT DIR = 1: port is in the input mode. 


This is a read-only bit. It is not possible to change the port direction under 
software control. 


ICH Input Channel Halt.
Write a 1 to ICH to halt the input channel. While halted, the input channel
cannot signal externally that it is ready to receive.

Clear ICH to 0 to unhalt the input channel.


OCH Output Channel Halt. 
Write a 1 to this bit to halt the output channel immediately. 


However, the communication port is still able to accept a token request from 
the input channel. 


Clear this bit to 0 to allow the output channel to transfer data. 

OUTPUT LEVEL Output FIFO Level. Contents of this 4-bit field:

0000b (0): indicates an empty output FIFO.
0001b (1) through 0111b (7): indicates the number of full positions in the
output FIFO.
1111b (15): indicates a full output FIFO.

An empty output buffer (OUTPUT LEVEL = 0000b) sends an unlatched,
positive level-triggered interrupt (OCEMPTY = 1) to the CPU. When the CPU
or DMA coprocessor writes to the empty output FIFO, OCEMPTY is cleared
to 0 and remains in that state until the buffer is again empty. An output FIFO
with one or more empty levels also sends an unlatched, positive
level-triggered interrupt (OCRDY = 1) to the CPU and the DMA coprocessor.
This condition causes a READY/NOT READY signal to be generated when the
CPU or DMA coprocessor attempts to write to the output FIFO. See Section
12.6, Coordinating Communication Ports With the CPU and DMA Coprocessor,
on page 12-17, for details.


INPUT LEVEL Input FIFO Level. Contents of this 4-bit field:

0000b (0): indicates an empty input FIFO.
0001b (1) through 0111b (7): indicates the number of full positions in the
input FIFO.
1111b (15): indicates a full input FIFO.

A full input FIFO (INPUT LEVEL = 1111b) sends an unlatched, positive
level-triggered interrupt (ICFULL = 1) to the CPU. When the CPU or DMA
coprocessor reads from the full input FIFO, ICFULL is cleared to 0 and
remains in that state until the FIFO is again full. An input FIFO with one or
more full levels also sends an unlatched, positive level-triggered interrupt
(ICRDY = 1) to the CPU and the DMA coprocessor. This condition causes a
READY/NOT READY signal to be generated when the CPU or DMA coprocessor
attempts to read from the input FIFO.


Reserved Undefined. 
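

As an illustration of the field descriptions above, the following host-side Python sketch unpacks a value read from a CPCR. It assumes this field layout: PORT DIR in bit 2, ICH in bit 3, OCH in bit 4, OUTPUT LEVEL in bits 8-5, and INPUT LEVEL in bits 12-9. The ICH and OCH positions and the use of bit 12 for the input level are stated elsewhere in this chapter; the remaining positions are taken from Figure 12-4 and should be checked against it. The function name decode_cpcr is hypothetical.

```python
def decode_cpcr(value):
    """Unpack the CPCR fields from a 32-bit register value.

    Bit positions are an assumption of this sketch (see the text above).
    """
    return {
        "port_dir": (value >> 2) & 0x1,      # 0 = output mode, 1 = input mode
        "ich": (value >> 3) & 0x1,           # 1 = input channel halted
        "och": (value >> 4) & 0x1,           # 1 = output channel halted
        "output_level": (value >> 5) & 0xF,  # 0 = empty, 15 = full
        "input_level": (value >> 9) & 0xF,   # 0 = empty, 15 = full
    }
```

With this layout, the mask 01FE0h used in the reset example of subsection 12.3.4 covers exactly the two level fields.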


12.3.2 Input-Port Register 


This read-only register contains the contents of position 0, the oldest value of 
the input FIFO. If this register is written to, its contents remain unchanged. 
Reading from an empty input FIFO causes the CPU or DMA operation to stall 
and to halt the peripheral bus. 


12.3.3 Output-Port Register 


This write-only register interfaces to position 7 (the newest value) of the output 
FIFO. If this register is read, its contents remain unchanged, and the value 
read is undefined.


If a full output FIFO is written to, the peripheral-bus interface latches
the word and returns a not-ready signal. This condition clears when an
empty position appears in the output FIFO and the data on the bus is
transferred to the FIFO.


12.3.4 Communication-Port Software Reset Register 


The input and output FIFO levels for a communication port can be flushed by
writing at least two back-to-back values to its communication-port software
reset address, as specified in Table 12-1. The communication port reset
feature does not affect the status of the external pins.


Table 12-1. Communication-Port Software Reset Address (’C44 and ’C40 revision 5.0 and above)


COMMUNICATION PORT    SOFTWARE RESET ADDRESS
0†                    0x0100043
1                     0x0100053
2                     0x0100063
3†                    0x0100073
4                     0x0100083
5                     0x0100093


† These ports are available only in the ’C40.


Example 12-1 shows a method for resetting a communication port.


Example 12-1. Communication Port Reset


;
; RESET1: Flushes FIFOs data for communication port 1
;
RESET1  push    AR0             ; Save registers
        push    R0
        push    RC
        ldhi    010h,AR0        ; Set AR0 to base address of COM 1
        or      050h,AR0
FLUSH:  rpts    1               ; Flush FIFO data with back-to-back write
        sti     R0,*+AR0(3)
        rpts    10              ; Wait
        nop
        ldi     *+AR0(0),R0     ; Check for new data from other port
        and     01FE0h,R0
        bnz     FLUSH
        pop     RC              ; Restore registers
        pop     R0
        pop     AR0
        rets                    ; Return

12.4 Port Arbitration Units (PAUs)


The PAU arbitrates between two devices to determine which device has pos- 
session of the communication port data bus at any given time. This arbitration 
uses CREQ and CACK signals to pass the bus ownership token back and forth 
between two devices connected via their communication ports. Token transfer 
operation is covered in detail in Section 12.7, Token Transfer Operation. 


After system reset, half of the communication channels associated with a par- 
ticular ’C4x have token ownership (communication ports 0, 1, 2), and the other 
half (communication ports 3, 4, 5) do not. 


The PAU is a synchronous state machine with four states, as shown in
Table 12-2. These states are not software-accessible by the CPU or the DMA
coprocessor.


Table 12-2. PAU State Definitions 


State 0: Idle with token
   Summary:
      1. PAU has token (PORT DIR = 0).
      2. Channel not in use.
   PAU status:
      The PAU currently has possession of the bus ownership token, and its
      associated communication channel is not in use. Under this condition,
      the PORT DIR bit of the associated CPCR is 0 (output). This is the
      state of communication ports 0, 1, and 2 after system reset.

State 1: Idle without token
   Summary:
      1. PAU does not have token (PORT DIR = 1).
      2. Token not requested by PAU (OUTPUT LEVEL = 0).
   PAU status:
      The PAU currently does not have possession of the bus ownership token
      and has not requested the token. Under this condition, the PORT DIR bit
      equals 1 (input), and the OUTPUT LEVEL field equals 0 (empty output
      FIFO). This is the state of communication ports 3, 4, and 5 after
      system reset.

State 2: Active
   Summary:
      1. PAU has token (PORT DIR = 0).
      2. Channel is in use (OUTPUT LEVEL ≠ 0).
   PAU status:
      The PAU currently has possession of the bus ownership token, and its
      associated communication channel is in use. Under this condition, the
      PORT DIR bit equals 0 (output), and the OUTPUT LEVEL field does not
      equal 0.

State 3: Waiting for token
   Summary:
      1. PAU does not have token (PORT DIR = 1).
      2. Token requested by PAU (OUTPUT LEVEL ≠ 0).
   PAU status:
      The PAU currently does not have the bus ownership token but has
      requested it. Under this condition, the PORT DIR bit equals 1 (input),
      and the OUTPUT LEVEL field does not equal 0.

Figure 12-5 shows the state diagram and controlling equations for state
transitions.


To place data on the communication port data bus, the PAU must arbitrate be- 
tween two types of requests: 


- On-chip requests to output data in the output FIFO (shown as BUSRQ = 1
  in Figure 12-5), and

- External requests received via the CREQ line (shown as TOKRQ = 1 in
  Figure 12-5).


Figure 12-5. Communication-Port Arbitration-Unit State Diagram 


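[State diagram. States 0 (idle with token) and 2 (active) own the token;
states 1 (idle without token) and 3 (waiting for token) do not own the token.
Transitions shown:

   State 0 to state 0: BUSRQ = 0; TOKRQ = 0
   State 0 to state 2: BUSRQ = 1 (transmit a word)
   State 2 to state 2: BUSRQ = 1 (bus being used)
   State 2 to state 0: BUSRQ = 0 (finished a one-word transfer)
   State 0 to state 1: TOKRQ = 1 (other PAU requests token; token released
                       and passed using CACK)
   State 1 to state 1: BUSRQ = 0
   State 1 to state 3: BUSRQ = 1 (request token from other PAU using CREQ)
   State 3 to state 3: BUSAK = 0
   State 3 to state 2: BUSAK = 1 (token received from other PAU over CACK)]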

To further examine the port arbitration scheme represented in Figure 12-5, 
consider a data transfer operation from ’C4x A to ’C4x B. The transfer begins 
with PAU A in state 0 (idle with token) and PAU B in state 1 (idle without token). 
If PAU A receives a request (BUSRQ = 1) from its output buffer to use the
communication-port data bus, it allows the output buffer to transmit one word
immediately and enters state 2 (active). After the output buffer transmits one
word, it removes the bus request (BUSRQ = 0), and PAU A returns to state 0
(idle with token).


If PAU B receives a request from its output buffer to use the bus, it activates
CREQ to request the token from PAU A. PAU A detects this request via the 
state variable TOKRQ=1 and then activates the CACK line to transfer the bus 
ownership token to PAU B. PAU B then generates an internal bus acknowledge 
(BUSAK = 1) to indicate that it has gained bus ownership. As a result of this 
token transfer operation, PAU A enters state 1 (idle without token), and PAU 
B starts the word transfer and enters state 2 (active). 


To prevent any communication port from monopolizing the communication- 
port bus, the PAU always returns to state 0 (idle with token) and checks for a 
token request (CREQ active) from the external device after each word transfer. 
If the token request is active, the token is passed to the requesting device so 
that it can transmit a word. As long as ’C4x A and ’C4x B have information to 
send in their output FIFOs, they alternate use of the data bus to provide a bi- 
directional data path. 


If a token request is received at the end of a word transfer and the sender ’C4x 
has another word in the output FIFO to send, two situations can occur: 


- If the CREQ going low signal is received before CRDY low is received for
  the last byte, the sender ’C4x releases the token at the end of the current
  word transfer.

- If the CREQ going low signal is received after or at the same time as CRDY
  goes low from the last byte, the sender ’C4x continues owning the token;
  only after transferring the next word will it release token ownership.


In summary, token transfer occurs only on word boundaries. The ’C4x will not 
release the token until the transfer of the four bytes completes. 
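

The arbitration described in this section can be summarized as a small state-transition function. The following Python sketch is a simplified host-side model, not a description of the silicon: it evaluates one transition per call, it gives a pending token request priority over a local bus request in state 0 (as the text above describes for the check after each word transfer), and it ignores the CREQ/CRDY timing cases just listed. All names are hypothetical.

```python
IDLE_WITH_TOKEN, IDLE_WITHOUT_TOKEN, ACTIVE, WAITING_FOR_TOKEN = range(4)


def next_state(state, busrq=False, tokrq=False, busak=False):
    """Return the next PAU state for the given request signals."""
    if state == IDLE_WITH_TOKEN:
        if tokrq:
            return IDLE_WITHOUT_TOKEN   # token released and passed using CACK
        if busrq:
            return ACTIVE               # transmit a word
        return IDLE_WITH_TOKEN
    if state == ACTIVE:
        # Bus stays in use until the one-word transfer finishes.
        return ACTIVE if busrq else IDLE_WITH_TOKEN
    if state == IDLE_WITHOUT_TOKEN:
        # A local request makes the PAU ask for the token using CREQ.
        return WAITING_FOR_TOKEN if busrq else IDLE_WITHOUT_TOKEN
    if state == WAITING_FOR_TOKEN:
        # The token arrives from the other PAU over CACK.
        return ACTIVE if busak else WAITING_FOR_TOKEN
    raise ValueError("unknown PAU state")
```

Stepping two such models against each other, with each side raising BUSRQ whenever its output FIFO is not empty, reproduces the alternating use of the bus described above.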

12.5 Halting of Input and Output FIFOs


The ’C4x can halt the input FIFO, or the output FIFO, or both at word
boundaries.


To halt an input FIFO, write a 1 to bit 3 (ICH) of the communication port control 
register (CPCR). This bit also can be read to determine if the port is halted or 
is able to receive. Write a 0 to the ICH bit to unhalt the input FIFO. 


To halt an output FIFO, write a 1 to bit 4 (OCH) of the communication port con- 
trol register (CPCR). This bit also can be read to determine whether the port 
is halted or is able to transmit. Write a 0 to the OCH bit to unhalt the output 
FIFO. The halt/unhalt operations are discussed in the following subsections. 
A summary is provided in Table 12-3. 


Table 12-3. Summary of Input and Output FIFO Halting


Input halted, output unhalted
   If the port has the token:
      a. Will not release token
      b. Will transmit data
   If the port does not have the token:
      a. If the halt signal is present when the input FIFO finishes receiving
         a word, the port will not signal ready when the first byte of a new
         word is received (transfer frozen). If the halt signal is received
         with no word reception in progress, the port receives one word and
         then halts.
      b. If halted after the first byte is received, the port receives the
         rest of the word and then halts the input.

Input unhalted, output halted
   If the port has the token:
      a. Will not transmit data
      b. If halted after the first byte is sent, it completes the word
         transfer and then halts the output.
      c. Will release token
   If the port does not have the token:
      a. Will receive data
      b. Will not request token

Input halted, output halted
   If the port has the token:
      a. Will not release token
      b. Will not transmit data
      c. If halted after the first byte is sent, it completes the word
         transfer and then halts the output.
   If the port does not have the token:
      a. If the halt signal is present when the input FIFO finishes receiving
         a word, the port will not signal ready when the first byte of a new
         word is received (transfer frozen). If the halt signal is received
         with no word reception in progress, the port receives one word and
         then halts.
      b. If halted after the first byte is received, the port receives the
         rest of the word and then halts the input.
      c. Will not request token


12.5.1 Input FIFO Halt Operation 


The goal of input FIFO halting is to halt the input FIFO as soon as possible, 
without losing the data being input. 


A communication port with an input FIFO that is either halted or full does not
respond to CSTRB low with CRDY low, and does not acknowledge a token request
with CACK low when CREQ low is received. This assures that the communication
port’s output channel remains open.


The communication-port logic checks whether an input FIFO halt signal has 
been written to the CPCR register only after finishing receiving a word. This 
implies: 


- If the communication port receives an input halt signal when there is no
  word reception in progress, the input FIFO does not halt immediately; it
  waits to receive one word and then halts. This is the case of an input FIFO
  halt after reset.

- If the halt signal is written to the CPCR register while a word is being
  received, the input FIFO receives the rest of the current word and then
  halts the input. At this point, the data transfer is frozen until the input
  FIFO is unhalted or a system reset occurs. If the input FIFO is unhalted
  later, the transfer continues without any loss of data.


Notice that even when an input FIFO is halted, you can still read the words pre- 
viously stored in the input FIFO. 


If a communication port’s input FIFO is halted during a token request from the
communication port to which it is connected, then the token request is
acknowledged before the input FIFO halts.


12.5.2 Output FIFO Halt Operation 


Output FIFO halting is analogous to input FIFO halting and occurs also at word 
boundaries. Assume that ’C4x A’s output FIFO has OCH = 1. Then the output 
FIFO will be halted on the basis of its current state. 


If communication port A does not have the token: 


- The output FIFO is halted immediately, and no request is made for the
  token.

- If the communication port requesting the token is halted after sending the
  CREQ signal low, the communication port still accepts the token and halts
  immediately after that.


If communication port A has the token:


- If it is currently transmitting a word, then after the current word is
  transmitted, the output FIFO is halted and no new transfers occur.

- If it is not currently transmitting a word, then the output FIFO halts
  immediately and no transfer occurs.

- If the input FIFO is not halted and the output FIFO is halted, then
  communication port A transfers the token when requested by communication
  port B.

- If the input FIFO is halted and the output FIFO is halted, then
  communication port A does not transfer the token when requested by
  communication port B.


If the communication port still has the token when it comes out of the halted 
state, it can transmit data if necessary. If it needs the token, it will arbitrate for 
the token as usual. 


In summary, a halted output FIFO does not transmit but releases the token if 
the input FIFO is not halted. 
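

The steady-state behavior summarized in Table 12-3 reduces to two independent rules, which the following host-side Python sketch encodes. It is a simplification: it ignores the word-boundary timing described in subsections 12.5.1 and 12.5.2, it does not model a full input FIFO (which also blocks a token release), and the result for an unhalted output on a port without the token is inferred from subsection 12.5.2 rather than stated in the table. The function name is hypothetical.

```python
def halted_port_behavior(ich, och, has_token):
    """Summarize what a port does for given ICH and OCH settings.

    ich, och: True when the input or output channel is halted.
    has_token: True when the port currently owns the bus token.
    """
    if has_token:
        return {
            "transmits": not och,        # a halted output sends no new words
            "releases_token": not ich,   # a halted input keeps the token
        }
    return {
        "receives": not ich,             # a halted input freezes reception
        "requests_token": not och,       # a halted output never asks for it
    }
```

For example, with only the output halted, a port that holds the token reports that it does not transmit but does release the token, which agrees with the summary at the end of subsection 12.5.2.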


12.6 Coordinating Communication Ports With the CPU and DMA Coprocessor


The communication ports support synchronization with two types of signals:


- A ready/not-ready signal that can halt CPU and DMA accesses to a
  communication port

- Interrupts that can be used to signal the CPU and DMA


The simplest form of synchronization is based on a ready/not-ready signal. If
the DMA or CPU attempts to read an empty input FIFO or write to a full output
FIFO, a not-ready signal is returned, and the DMA or CPU continues to read
or write (halting the peripheral bus) until a ready signal is received. The ready
signal for the output channel is OCRDY (output channel ready), which is also
an interrupt signal. The ready signal for the input channel is ICRDY (input
channel ready), which is also an interrupt signal.


In the interrupt form of synchronization, each communication port generates
four different interrupt signals, as listed below (interrupt vector locations for
these are shown in Figure 7-2):


- ICFULL (input channel full): indicates that the input FIFO has eight words.

- ICRDY (input channel ready): indicates that at least one word is in the
  input FIFO.

- OCRDY (output channel ready): indicates that at least one word space is
  available in the output FIFO.

- OCEMPTY (output channel empty): indicates that the output FIFO is
  empty.


The CPU can respond to all four of these interrupt signals. The DMA
coprocessor can respond only to the ICRDY and OCRDY interrupt signals. Each
DMA channel can respond only to the ICRDY and OCRDY signals coming from its
own communication port; that is, DMA channel i can synchronize only with
ICRDYi and OCRDYi.


Notice that none of the four communication-port interrupt signals has flags in 
the IIF register. These four communication-port status signals (ICFULL, 
ICRDY, OCRDY, and OCEMPTY) can be obtained by checking the input and 
output levels in the communication port control register (CPCR) with logical 
instructions. For example, to poll for an ICFULL condition, bit 12 can be tested 
for a bit value equal to 1. See subsection 12.3.1, Communication-Port Control 
Register (CPCR), on page 12-8, for more information about checking for 
communication-port conditions. 
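

The polling approach described above can be sketched as follows. This host-side Python model derives the four status conditions from a CPCR value, assuming INPUT LEVEL occupies bits 12-9 and OUTPUT LEVEL occupies bits 8-5, and that a level of 1111b means full, as described in subsection 12.3.1. The function name port_status is hypothetical.

```python
def port_status(cpcr):
    """Derive ICFULL, ICRDY, OCRDY, and OCEMPTY from a CPCR value."""
    input_level = (cpcr >> 9) & 0xF
    output_level = (cpcr >> 5) & 0xF
    return {
        "ICFULL": bool(cpcr & (1 << 12)),   # bit 12 set only when input is full
        "ICRDY": input_level != 0,          # at least one word to read
        "OCRDY": output_level != 0xF,       # at least one free position
        "OCEMPTY": output_level == 0,       # nothing left to transmit
    }
```

On the ’C4x itself, the same checks would be made with logical instructions on the value loaded from the CPCR address of the port.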

Maximum Communication Port Sustained Transfer Rate. The maximum
data transfer rate of any single communication port in a 50-MHz ’C4x is
20 Mbytes/s. This rate can be easily achieved under CPU or DMA coprocessor
control, as long as data is sent to the output FIFO at least at this rate.
However, when multiple communication ports are transmitting simultaneously,
this may not be the case. For example, the DMA memory-to-memory maximum
transfer rate is 50 Mbytes/s (one read-write sequence every two cycles). The
DMA can handle up to two communication ports transmitting at their full speed.
For more than two communication ports, the DMA becomes the bottleneck,
regardless of how many DMA channels are used. The CPU can perform two reads
and two writes in two cycles by using parallel instructions, achieving a
100-Mbytes/s transfer rate. For more than five communication ports, the CPU
becomes the bottleneck.
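

The figures in the preceding paragraph follow from simple arithmetic, shown here as a host-side Python sketch. It assumes that one cycle is two periods of the 50-MHz clock (a 40-ns cycle) and that each transfer moves one 4-byte word; the function names are hypothetical.

```python
BYTES_PER_WORD = 4
PORT_RATE_MBYTES = 20   # maximum rate of one communication port at 50 MHz


def transfer_rate_mbytes(clock_mhz, words, cycles):
    """Rate in Mbytes/s when `words` words move every `cycles` cycles."""
    cycle_rate_mhz = clock_mhz / 2   # assumption: one cycle = two clock periods
    return words * BYTES_PER_WORD * cycle_rate_mhz / cycles


def full_speed_ports(source_rate, port_rate=PORT_RATE_MBYTES):
    """Number of ports a data source can keep running at full speed."""
    return int(source_rate // port_rate)
```

One word every two cycles gives the 50-Mbytes/s DMA rate and two full-speed ports; two words every two cycles gives the 100-Mbytes/s CPU rate and five full-speed ports.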


12.7 Token Transfer Operation 


Token transfer operation requires handshaking of signals through pins CREQ
and CACK. This is illustrated in Figure 12-6. For clarity, a suffix identifies the
signals at each processor end. For example, CREQb denotes the CREQ signal
at the processor B end. Table 12-4 lists the handshaking events. Steps in
the table are shown by numbers in Figure 12-6.


Notice that an overlap feature is built into CREQ, CSTRB, and CRDY when a 
token is transferred between two ’C4x communication ports. This overlap will 
cause these signals to drive high (at both ends), ensuring that neither end is 
susceptible to floating or low-noise signals. For example, in Figure 14-23, 
CSTRB is an output before CREQ goes high, and in Figure 14-24, CSTRB be- 
comes an input only after CREQ goes high. Both ’C4xs drive communication 
port lines for a period of 0.5 H1/H3, but this is not a problem, because they are 
both driving high; as a result, there is no current from one device to the other. 


For this reason, the clocks of two ’C4xs connected together must be within a 
2:1 ratio (at most, one ’C4x can be twice as fast as the other). If this guideline 
is not followed, the overlap will last too long, and the ’C4x with the faster clock 
may start driving low before the current bus master has relinquished that line. 
This will cause signal contention that could damage communication port driv- 
ers. 


There is no limit on the time period between CREQ and CACK. The ’C4x can
perform token transfer with a slow non-’C4x device, as long as correct
handshaking of CREQ and CACK is maintained and there is no signal contention.


To avoid bus contention problems, you should understand which event trig- 
gers the switch of the direction (input-to-output or output-to-input) of each of 
the communication port bidirectional lines. This is especially important when 
you attempt to build a communication port interface to a non-’C4x device or 
when you work with very long ’C4x links. For example, the data lines and 
CSTRB should not be driven after CACK goes low. If they are, this could cause 
a bus conflict. 


An implementation of a hardware token forcer can be found in the Commu- 
nication Ports chapter of the TMS320C4x General-Purpose Applications 
User’s Guide. 

Figure 12-6. Token Transfer Operation

[Timing diagram: communication-port signals at processor A (initial token
owner) and processor B (token requester), with events numbered as in
Table 12-4. Shaded regions show when a signal is an input; clear regions show
when a signal is an output.]

Note: For an explanation of type 1 delay, see Section 12.9, Synchronizers.

Table 12-4. Token Transfer Sequence


Event No.†   Description
0            Initially, A has the token and is idle.
1            B wants to send data and requests the token by bringing CREQb low.
2            After a transmission line time delay, A sees the token request
             when CREQa goes low.
3            After a type 1 delay from CREQa falling, A releases token
             ownership and acknowledges the request by bringing CACKa low.
4            After a transmission line time delay, B sees the acknowledgement
             from A when CACKb goes low.
5            A switches CRDYa from high impedance to high after CACKa falling.
6            A puts CDa(7-0) in high impedance after CACKa falling.
7            B switches CSTRBb from high impedance to high after CACKb falling.
8            B brings CREQb high after a type 1 delay from CACKb falling.
9            After a transmission line time delay, A sees CREQa go high.
10           A switches CREQa from high impedance to high after receiving a
             high on CREQa.
11           A brings CACKa high after CREQa goes high.
12           A puts CACKa in high impedance after CREQa goes high and after
             CACKa goes high.
13           A puts CSTRBa in high impedance after CREQa goes high.
14           B puts CREQb in high impedance after CREQb goes high.
15           B switches CACKb from high impedance to high after CREQb goes high.
16           B puts CRDYb in high impedance after CREQb goes high.
17           B switches CDb from input to output after CREQb goes high and
             starts driving an undefined value.
18           B drives the first byte onto CDb(7-0) on H1 rising (plus analog
             delay) after CREQb goes high.
19           B brings CSTRBb low on the second H1 rising (plus analog delay)
             after CREQb rising.
20           After a transmission time delay, A sees the first byte on CDa(7-0).
21           After a transmission time delay, A sees CSTRBa go low, signaling
             valid data.
22           A reads the data and then brings CRDYa low.

† Event numbers correspond to numbers in Figure 12-6.

12.8 Word Transfer Operation


The ’C4x communication ports transfer words one byte at a time (LS byte
is transmitted first). Byte transfer operation requires handshaking of signals
through pins CSTRB and CRDY. This is illustrated in Figure 12-7. For clarity,
a suffix identifies the signals at each processor end. For example, CSTRBb
denotes the CSTRB signal at the processor B end. Table 12-5 lists the
handshaking events. Steps in the table are shown by numbers in Figure 12-7.


Byte transmission is totally asynchronous, and the communication-port trans- 
fer rate can be higher than one byte per cycle. The exception is for the first byte. 
Notice that on the first byte, the data lines are set up in relation to an H1 syn- 
chronization (output FIFO advance). The first byte appears on a different H1 
edge, depending on the transmit mode used. If the communication port is in 
continuous transmit mode (no token exchanged), the first data byte appears 
synchronous to the H1 falling edge before CSTRB going low. That is, the data 
appears one half of one H1 cycle before CSTRB falls. If a token transfer oc- 
curs, the first byte appears synchronous to the rising edge of H1 before 
CSTRB going low. That is, data appears one H1 cycle before CSTRB falls. 


Subsequent bytes and CSTRB high become valid from the falling edge of 
CRDY. Because both of these signals are caused by the same event but have 
different internal paths, their delay values are not exactly the same but are very 
close. 


During back-to-back write cycles, a type 2 synchronizer is used between 
CRDY low to CSTRB low before byte 0 (first byte) of the next word is trans- 
mitted. Communication port synchronizers are explained in Section 12.9, Syn- 
chronizers, on page 12-26. 


Even if the availability of data is guaranteed, do not tie CSTRB or CRDY to
ground. For each byte transfer, there must be a CSTRB and CRDY handshake. The
’C4x must see the transitions in the CSTRB and CRDY signals to advance its
internal byte counter.


If an input buffer becomes full, it will not activate CRDY at the beginning of the 
transmission of the first byte that would overflow the buffer. This condition pre- 
vents data transfer operations until the situation is resolved. When the receiver 
reads the full input buffer, CRDY falls, and the next FIFO position is made avail- 
able. 


Notice in Figure 12-7 that after CRDYb goes low (byte 3 has been received),
B drives an undefined value temporarily on CDb(7-0) (event 12 in
Figure 12-7) before driving byte 0 of the new word.


Figure 12-7. Word Transfer Operation

[Timing diagram not reproduced. Panels: PROCESSOR B (sender) and
PROCESSOR A (receiver). Surviving signal labels: CREQb, CACKb, and CDb for
processor B; CREQa, CACKa, and CDa for processor A. The numbered events in
the diagram are described in Table 12-5.]

Shaded = when signal is an input (clear = when signal is an output).

Note: B0' = byte 0 of a new word.
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Table 12-5.Word Transfer Sequence 


Event
No.†    Description

0       B owns the token and has data to transmit.
1       B drives the first byte onto CDb(7-0) on H1 falling (plus analog
        delay)‡.
2       B brings CSTRBb low on H1 rising (plus analog delay)§.
3       After a transmission line time delay, A sees CSTRBa go low,
        signaling valid data.
4       A reads the data and then brings CRDYa low.
5       After a transmission line time delay, B sees CRDYb go low,
        signaling data has been read.
6       B brings CSTRBb high after CRDYb goes low.
7       B drives the next byte on CDb(7-0) after CRDYb goes low.
8       After a transmission line time delay, A sees CSTRBa go high.
9       A brings CRDYa high after CSTRBa goes high.
10      After a transmission line time delay, B sees CRDYb go high.
11      B brings CSTRBb low after CRDYb goes high.
        Events 3 through 11 repeat twice for bytes 2 and 3 (asynchronous
        handshaking).
12      B drives an undefined byte on CDb(7-0) after CRDYb goes low.

† Event numbers correspond to numbers in Figure 12-7.

‡ If this is the first word after the token is received, this transition
  occurs after CREQb goes high (see event 18 in Table 12-4).

§ If this is the first word after the token is received, this transition
  occurs on the second H1 rising after CREQb goes high (see event 19 in
  Table 12-4).
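
The per-byte handshake in Table 12-5 can be walked through with a small software model. The Python sketch below is illustrative only: the function name, the log format, and the least-significant-byte-first ordering are assumptions of the sketch, and it ignores H1 synchronization, analog delays, and transmission line delays. It only shows that every byte needs a complete CSTRB/CRDY handshake.

```python
def transfer_word(word):
    """Model one 4-byte word transfer from sender B to receiver A.

    Signals are active low (0 = asserted). Returns (received_bytes, log).
    """
    cstrb, crdy = 1, 1              # both strobes inactive before the word
    received, log = [], []
    # Assumption of this sketch: byte 0 is the least significant byte.
    data = [(word >> (8 * i)) & 0xFF for i in range(4)]
    for i, byte in enumerate(data):
        assert crdy == 1            # B may strobe only after CRDY is high
        cd = byte                   # B drives the byte onto CD(7-0)
        cstrb = 0                   # B brings CSTRB low: data is valid
        log.append((i, "CSTRB low"))
        received.append(cd)         # A reads the data ...
        crdy = 0                    # ... and then brings CRDY low
        log.append((i, "CRDY low"))
        cstrb = 1                   # B brings CSTRB high after CRDY low
        log.append((i, "CSTRB high"))
        assert cstrb == 1           # A releases CRDY only after CSTRB high
        crdy = 1                    # A brings CRDY high
        log.append((i, "CRDY high"))
    return received, log
```

Each byte contributes four transitions to the log, which is why tying CSTRB or CRDY to a fixed level would prevent the byte counter from advancing.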
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CSTRB Width Restrictions 


In ’C4x device revisions lower than 3.0, the width of the CSTRB low pulse
between word boundaries should not exceed 1.0 H1/H3 cycle at the receiving
’C4x end. If it does, the receiving ’C4x byte counter, which has looped back
to byte 0 at the word boundary, sees CSTRB still low and treats it as the
strobe for the next valid byte, effectively slipping a byte. This is not a
problem unless you are working with very long distances or with external
devices. If you are, use flip-flops to locally shorten CSTRB at the receiver
end while returning a valid CRDY width to the sender. Wide pulses at the
sender are not a problem. Chapter 7, Interfacing Communication Ports, in the
TMS320C4x General-Purpose Applications User’s Guide shows a circuit to
shorten the CSTRB low pulse. In ’C4x device revisions 3.0 or higher, no CSTRB
width restriction exists.


Note:

See Chapter 7, Interfacing Communication Ports, in the TMS320C4x
General-Purpose Applications User’s Guide for a detailed description of the
word transfer operation when interfacing a ’C4x communication port with a
non-’C4x device.
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12.9 Synchronizers 


H1/H3 synchronization is required during word transfer boundaries and during 
token transfers. Three types of synchronizers are used in the port arbitration 
unit: 


- Type-one synchronizers cause delays of 1 to 2 machine clocks from the
  receipt of an input on a pin until the response on an output pin
  (ignoring analog delays). An input is recognized when H1 is high; then
  it is passed through an H3-high/H1-high series of delays. The response
  occurs at the start of the next H3-high phase.

The minimum type-one synchronizer delay of 1 machine clock occurs when
the input changes just before H1 goes low. This delay is shown in Figure 12-8.

The maximum type-one synchronizer delay of 2 machine clocks occurs
when the input changes just after H1 goes low. This delay is shown in
Figure 12-9.


Figure 12-8. Type-One Synchronizer Minimum Delay

- Type-two synchronizers cause delays of 1.5 to 2.5 machine clocks from
  the receipt of an input on a pin until the response on an output pin
  (ignoring analog delays). An input is recognized when H1 is high; then
  it is passed through an H3-high/H1-high/H3-high series of delays. The
  response occurs at the start of the next H1-high phase.

The minimum type-two synchronizer delay of 1.5 machine clocks occurs when
the input changes just before H1 goes low. This delay is shown in
Figure 12-10.

The maximum type-two synchronizer delay of 2.5 machine clocks occurs
when the input changes just after H1 goes low. This delay is shown in
Figure 12-11.

Figure 12-10. Type-Two Synchronizer Minimum Delay

[Timing diagram not reproduced: input-to-response delay of 1.5 clocks.]

Figure 12-11. Type-Two Synchronizer Maximum Delay

[Timing diagram not reproduced: input-to-response delay of 2.5 clocks.]


- Type-three synchronizers cause delays of 0.5 to 1.5 machine clocks from
  the receipt of an input on a pin until the response on an output pin
  (ignoring analog delays). An input is recognized when H1 is high; then
  it is passed through an H3-high delay. The response occurs the next
  time H1 is high.

The minimum type-three synchronizer delay of 0.5 machine clocks occurs
when the input changes just before H1 goes low. This delay is shown
in Figure 12-12.

The maximum type-three synchronizer delay of 1.5 machine clocks occurs
when the input changes just after H1 goes low. This delay is shown in
Figure 12-13.

Figure 12-12. Type-Three Synchronizer Minimum Delay

[Timing diagram not reproduced: input-to-response delay of 0.5 clock.]

Figure 12-13. Type-Three Synchronizer Maximum Delay

[Timing diagram not reproduced: input-to-response delay of 1.5 clocks.]


Table 12-6 shows the types of synchronizer delays for communication port 
signals. 


Table 12-6. Communication-Port Signals and Synchronizer Delays

                                             Delay   Min. Delay      Max. Delay
Input Signal to Output Signal                Type    (clock cycles)  (clock cycles)
CREQ↓ to CACK↓                               One     1               2
CACK↓ to CREQ↑                               One     1               2
CRDY↓ to CD valid between back-to-back       One     1               2
  word transfers
CRDY↓ to CSTRB↓ between back-to-back         Two     1.5             2.5
  word transfers
CACK↓ to CSTRB switch from input to an       Three   0.5             1.5
  output high
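
The delay bounds in Table 12-6 are given in machine clock cycles, so converting them to time only requires the H1 period. The following Python sketch does that conversion; the dictionary and function names are hypothetical, and analog delays are ignored, as in the table.

```python
# Delay bounds in machine clock cycles, as listed in Table 12-6.
SYNC_DELAY_CYCLES = {
    "one": (1.0, 2.0),
    "two": (1.5, 2.5),
    "three": (0.5, 1.5),
}


def sync_delay_ns(sync_type, h1_period_ns):
    """Return (min_ns, max_ns) for a synchronizer type, ignoring analog delays."""
    if h1_period_ns <= 0:
        raise ValueError("H1 period must be positive")
    low, high = SYNC_DELAY_CYCLES[sync_type]
    return low * h1_period_ns, high * h1_period_ns
```

For a hypothetical H1 period of 50 ns, `sync_delay_ns("two", 50)` gives a CRDY-low to CSTRB-low delay between 75 ns and 125 ns.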
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12.10 Module Reset 


This section explains the status of the ’C4x communication ports after power- 
up and during and after system reset. 


The recommended reset sequence in a multiprocessing system is described 
in Chapter 1, Processor Initialization and Program Control, in the TMS320C4x
General-Purpose Applications User’s Guide. 


After powerup, the status depends on the RESET pin: 


If RESET is low, the ’C4x is in reset immediately, and the description under
“At reset” (below) applies.

If RESET is not low, the ’C4x device is in an unknown state. The
communication port signals can be in a combination of states.


At reset (while RESET = 0), the communication port pins are all put in the 
high-impedance state. The input and output channels both assume an empty 
state, causing all values in the input and output buffers to be lost. Pullup resis- 
tors should be used on all control lines to ensure that they are logic high if reset 
is not applied at the same time in interconnected ’C4xs. 


After reset (after the rising edge of RESET), communication ports 0, 1, and
2 are configured as output ports and assume the following states:


- The PAU is reset to state 0: The PAU has the bus ownership token and is
  idle.

- The pin status (see Figure 12-14) is set as follows:
  - The CxD(7-0) signals start driving an undefined value.
  - The CACK and CSTRB signals go to 1 (inactive). CREQ and CRDY
    continue to be high-impedance.

Note:

The individual communication port software reset feature only flushes the
FIFOs; it has no effect on the communication port external pins.

- The communication port control register gets a value of 0h:
  - PORT DIR = 0: The communication port is configured for an output
    operation.
  - INPUT LEVEL = 0: The input FIFO is empty.
  - OUTPUT LEVEL = 0: The output FIFO is empty.
  - ICH = 0: The input FIFO is not in its halted state.
  - OCH = 0: The output FIFO is not in its halted state.

- ICRDY = 0: The input FIFO is empty and is not ready to be read from.

- OCRDY = 0: The output FIFO is not full and is ready to be written to.

Figure 12-14. Post-Reset State for an Output Port

[Timing diagram not reproduced. Surviving labels: CRDY; CD(7-0) undefined.]


After reset (after the rising edge of RESET), communication ports 3, 4, and 5
are configured as input ports and assume the following states:


- The PAU is reset to state 1: The PAU does not have the bus ownership
  token, and the token is not requested.

- The pin status (see Figure 12-15) is set as follows:
  - CxD(7-0) continue to be high-impedance.
  - The CREQ and CRDY signals go to 1 (inactive). CACK and CSTRB
    continue to be high-impedance.

- The communication port control register gets a value of 04h:
  - PORT DIR = 1: The communication port is configured for an input
    operation.
  - INPUT LEVEL = 0: The input FIFO is empty.
  - OUTPUT LEVEL = 0: The output FIFO is empty.
  - ICH = 0: The input FIFO is not in its halted state.
  - OCH = 0: The output FIFO is not in its halted state.

- ICRDY = 0: The input FIFO is empty and is not ready to be read from.

- OCRDY = 0: The output FIFO is not full and is ready to be written to.


Figure 12-15. Post-Reset State for an Input Port

At reset, ports 0, 1, and 2 are configured as output ports (PORT DIR
= 0), and ports 3, 4, and 5 are configured as input ports (PORT DIR
= 1). When you interconnect the ports of two ’C4x devices, connect
the port of one ’C4x to a port of the other ’C4x that would be in the
opposite direction at reset. In other words, connect any one of ports
0, 1, or 2 to any one of ports 3, 4, or 5.
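
The connection rule above can be checked mechanically when planning a multiprocessor board. The Python sketch below is a minimal illustration; the function names are hypothetical and it only encodes the reset directions stated in this section.

```python
OUTPUT_AT_RESET = {0, 1, 2}   # PORT DIR = 0 after reset
INPUT_AT_RESET = {3, 4, 5}    # PORT DIR = 1 after reset


def reset_direction(port):
    """Return the direction a communication port assumes after reset."""
    if port in OUTPUT_AT_RESET:
        return "output"
    if port in INPUT_AT_RESET:
        return "input"
    raise ValueError("communication ports are numbered 0 through 5")


def link_ok(port_a, port_b):
    """True if the two ports come out of reset in opposite directions."""
    return reset_direction(port_a) != reset_direction(port_b)
```

For example, `link_ok(0, 3)` is true, while `link_ok(1, 2)` is false because both ports would power up as outputs.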


If your system configuration requires connection of input-to-input
communication ports or output-to-output communication ports, refer to
Chapter 7, Interfacing Communication Ports, in the TMS320C4x
General-Purpose Applications User’s Guide for an implementation of a token
forcer.
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12.11 Tips for Using Communication Ports 
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When you design systems that use the communication ports, there are consid- 
erations to keep in mind: 


- At reset, ports 0-2 are set for transmit and ports 3-5 are set for receive.
  When connecting communication ports between ’C4x devices, make sure
  you connect transmit ports to receive ports and receive ports to transmit
  ports. Otherwise, unpredictable results may occur.

- Signal quality is very important. Make sure you design your board to
  minimize noise from other components. See the section entitled Signal
  Considerations in the communication port chapter of the TMS320C4x
  General-Purpose Applications User’s Guide for more information.

- Do not read from an empty input FIFO. Doing so stalls the CPU or DMA
  operation and halts the peripheral bus.

- Do not write to an unconnected communication port. If a port’s transmit
  FIFO is full and the port cannot transmit, an additional write to the
  port’s FIFO halts the peripheral bus.

- The clocks of two ’C4xs connected together must be within a 2:1 ratio (at
  most, one ’C4x can be twice as fast as the other). If this guideline is not
  followed, the ’C4x with the faster clock may start driving low before the
  current bus master has relinquished that line. This causes signal
  contention that could damage communication port drivers. This restriction
  does not apply when connecting to a non-’C4x device.

- When you design an interface to a non-’C4x device, the non-’C4x device
  should mimic the asynchronous handshaking operation of a ’C4x
  communication port. See Word Transfer Considerations in the TMS320C4x
  General-Purpose Applications User’s Guide for more information on
  interfacing to non-’C4x devices.
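
The 2:1 clock guideline in the list above is easy to verify for a pair of devices. The following Python sketch is illustrative; the function name is hypothetical and the frequencies in the usage note are example values, not recommendations.

```python
def clock_ratio_ok(f_a_hz, f_b_hz):
    """True if two connected 'C4x clocks are within a 2:1 ratio."""
    if f_a_hz <= 0 or f_b_hz <= 0:
        raise ValueError("clock frequencies must be positive")
    fast, slow = max(f_a_hz, f_b_hz), min(f_a_hz, f_b_hz)
    # At most, one device can be twice as fast as the other.
    return fast <= 2 * slow
```

With example clocks of 40 MHz and 20 MHz the check passes; with 50 MHz and 20 MHz it fails.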


Note:

See Section 7.4, Design Tips, in the TMS320C4x General-Purpose
Applications User’s Guide for more tips for using communication ports.
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Timers 


The ’C4x has two general-purpose timer modules that time events, generate 
pulses, and interrupt the CPU or DMA coprocessor. 


This chapter provides you with information about: 


- The components of the timers
- The control registers of the timers
- The operation of the timers
- The interrupts generated by the timers
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13.1 Overview of the Timers 
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The ’C4x has two 32-bit general-purpose timer modules. Each timer has two 
signaling modes and can be clocked by an internal or an external source. The 
timer modules can be used to send periodic signals to the ’C4x or to devices 
in the external world; or they can be used to count external events. Each timer 
has an I/O pin (TCLK) that functions as an input clock, as an output clock, or 
as a general-purpose I/O pin. 


With an internal clock, for example, the timer can signal an external A/D
converter to start a conversion, or it can interrupt the ’C4x DMA controller
to begin a data transfer.


With an external clock, for example, the timer can count external events and 
interrupt the CPU after a specified number of events. 


Each timer consists of a 32-bit counter, a comparator, an input clock selector, 
a pulse generator, and supporting hardware. 


A timer in the ’C4x counts the cycles of a timer input clock. When that count
(the counter register) equals the value stored in the timer period register,
the timer resets the counter to zero and produces a transition in the timer
output signal.

The timer input clock can be either the internal clock, which runs at one-half
the H1 frequency of the ’C4x, or an external clock on the TCLKx pin. This is
determined by the CLKSRC bit in the timer control register. If an external
clock is used, the timer can count either 0-to-1 or 1-to-0 transitions,
depending on the value of the INV bit.


The timer output signal depends on the signalling mode selected by the C/P 
bit (clock or pulse mode). See Section 13.4, Timer Pulse Generation, on page 
13-9, for more information about this bit. 


The timer output can be routed to the TCLKx pin that can also be used as a 
general-purpose I/O pin. 


Figure 13-1 shows the block diagram of a ’C4x timer module. 


Figure 13-1. Timer Block Diagram

[Block diagram not reproduced. Blocks and signals shown: 32-bit counter;
period register (bits 31-0); comparator (period = counter?); pulse generator
(C/P bit); INV bit; TSTAT bit; DATOUT bit; timer output signal†; output
clock selector (CLKSRC and FUNC bits); TCLKx pin; timer input clock‡,
selected between internal clock/2 = H1/2 and external (TCLKx)§.]

† If CLKSRC = 1 and FUNC = 1, this signal goes into the TCLK pin.
‡ Selector controlled by the CLKSRC bit.
§ Maximum frequency = f(H1)/2.6
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13.2 Timer Pins 
Each timer has one associated pin, the timer clock (TCLK) pin.

- TCLK. This pin is used as a general-purpose I/O signal, as a timer output,
  or as an input for an external clock for a timer. Each timer has a TCLK pin:
  TCLK0 is connected to timer 0 and TCLK1 is connected to timer 1.
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13.3 Timer Control Registers 


The timers are controlled through three registers, as shown in Figure 13-2, 
that are mapped into the peripheral address space: 


- Control register. This register determines the operating mode of the
  timer, monitors the timer status, and controls the function of the I/O pin
  (TCLKx) of the timer.

- Period register. This register contains the number of timer input clock
  cycles to count. This number controls the timer output signal frequency.

- Counter register. This register contains the current value of the
  incrementing counter. The 32-bit counter counts timer input clock cycles.


Figure 13-2. Memory-Mapped Timer Locations

Register          Peripheral Address
                  Timer 0      Timer 1
Timer Control     100020h      100030h
Reserved          100021h      100031h
Reserved          100022h      100032h
...
Timer Counter     100024h      100034h
Reserved          100025h      100035h
...
Timer Period      100028h      100038h
...
Reserved          10002Dh      10003Dh
Reserved          10002Eh      10003Eh



13.3.1 Timer Control Register 


The timer control register is located at 100020h for timer 0 and at 100030h for 
timer 1. 


The 32-bit timer global control register contains two sets of bits: 


- The timer global control bits (bits 11-6) control timer mode and monitor
  timer status (TSTAT).

- The TCLK pin control bits (bits 3-0) control the function of the TCLK pin,
  which can be used as a timer pin or as a general-purpose I/O pin.


Figure 13-3 shows the 32-bit timer global control register. Note that at reset 
all bits are set to 0, except for DATIN, which is set to the value read on TCLK. 


Figure 13-3. Timer Control Register

Bit(s)   Field    Access
31-12    xx
11       TSTAT    R
10       INV      R/W
9        CLKSRC   R/W
8        C/P      R/W
7        HLD      R/W
6        GO       R/W
5-4      xx
3        DATIN    R
2        DATOUT   R/W
1        I/O      R/W
0        FUNC     R/W

Note: R = Read, W = Write

FUNC     Function bit. The FUNC bit controls the function of the TCLK pin.
         If FUNC = 0, TCLK is configured as a general-purpose digital I/O
         pin. If FUNC = 1, TCLK is configured as a timer pin.

I/O      Input/output bit. If I/O = 0 and FUNC = 0, then TCLK is configured
         as an input pin. If I/O = 1 and FUNC = 0, then TCLK is configured
         as an output pin.

DATOUT   Data output bit. DATOUT drives TCLK when the ’C4x is in I/O port
         mode. DATOUT can also be used as an input to the timer.

DATIN    Data input bit. Reads data from TCLK or DATOUT. A write to this
         bit has no effect.

GO       GO bit. Resets and starts the timer counter. When GO = 1 and the
         timer is not held, the counter is zeroed and begins incrementing
         on the next rising edge of the timer input clock. The GO bit is
         cleared on the same rising edge. When GO = 0, the timer is not
         affected. Section 13.8, Configuring a Timer, further defines this
         bit.

HLD      Counter hold bit. When HLD = 0, the counter is disabled and held
         in its current state. If the timer is driving TCLK, the state of
         TCLK is also held. The internal divide-by-two counter is also held
         so that the counter can continue where it left off when HLD is set
         to 1. The timer registers can be read and modified while the timer
         is being held. RESET has priority over HLD. Section 13.8,
         Configuring a Timer, describes the use of GO and HLD. The
         following table shows the result of a write using specified values
         of the GO and HLD bits in the timer global control register.

         GO       HLD
         (Bit 6)  (Bit 7)  Result
         0        0        All timer operations are held. No reset is
                           performed.
         0        1        Timer proceeds from state before write.
         1        0        All timer operations are held, including zeroing
                           of the counter. The GO bit is not cleared until
                           the timer is taken out of hold.
         1        1        Timer resets and starts.

C/P      Clock/pulse mode control. When C/P = 1, clock mode is chosen, and
         the signaling of the TSTAT status bit and TCLK pin has a 50 percent
         duty cycle. When C/P = 0, the TSTAT status bit and TCLK pin are
         active for one H1 cycle during each timer period (see Figure 13-4).

CLKSRC   Timer input clock source select bit. Specifies the source of the
         timer input clock. When CLKSRC = 1, an internal clock with
         frequency equal to one-half the H1 frequency is used as the timer
         input clock, and the INV bit has no effect. When CLKSRC = 0, an
         external signal from the TCLK pin is used as the timer input clock.
         The external clock is synchronized internally, thus allowing
         external asynchronous clock sources that do not exceed the
         specified maximum allowable external clock frequency of f(H1)/2.6.

INV      Inverter control bit. If an external clock is used as the timer
         input clock and INV = 1, the external clock is inverted as it goes
         into the counter. If the output of the pulse generator (TSTAT) is
         routed to TCLK and INV = 1, the output is inverted before it goes
         to TCLK. If INV = 0, no inversion is performed on the input or
         output of the timer. The INV bit has no effect, regardless of its
         value, when TCLK is used in I/O port mode.

TSTAT    Timer status bit. This bit tracks the output of the timer and sets
         a CPU interrupt on a transition from 0 to 1. A write has no effect.

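
The bit fields described above can be packed into a register value in software before it is written to the timer. The Python sketch below is a model for checking values, not device code. The positions of GO (bit 6) and HLD (bit 7) are stated in the text; the other positions follow the field order of Figure 13-3 and agree with the value 3C1h used in Example 13-1, but should be treated as assumptions of this sketch. All function and constant names are hypothetical.

```python
# Bit positions in the timer global control register (see lead-in for caveats).
FUNC, IO, DATOUT, DATIN = 0, 1, 2, 3
GO, HLD, CP, CLKSRC, INV, TSTAT = 6, 7, 8, 9, 10, 11


def timer_control_word(func=0, io=0, datout=0, go=0, hld=0,
                       cp=0, clksrc=0, inv=0):
    """Pack the writable control bits into a register value."""
    fields = {FUNC: func, IO: io, DATOUT: datout, GO: go,
              HLD: hld, CP: cp, CLKSRC: clksrc, INV: inv}
    word = 0
    for bit, value in fields.items():
        if value not in (0, 1):
            raise ValueError("each field must be 0 or 1")
        word |= value << bit
    return word


def go_hld_result(go, hld):
    """Summarize the effect of writing the given GO and HLD values."""
    return {
        (0, 0): "held, no reset",
        (0, 1): "proceeds from state before write",
        (1, 0): "held, counter zeroing deferred until hold is released",
        (1, 1): "resets and starts",
    }[(go, hld)]
```

Setting FUNC, GO, HLD, C/P, and CLKSRC to 1 yields 3C1h, which starts the timer in clock mode on the internal clock with the output routed to TCLK.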
13.3.2 Timer Period Register 


The timer period register is located at 100028h for timer 0 and at 100038h for 
timer 1. 


The 32-bit timer period register contains the number of timer input clock cycles 
to count. This number controls the frequency of the timer output signal. 




The frequency of timer signaling is determined by the frequency of the timer 
input clock and the period register. The following equations are valid with either 
an internal or an external timer clock: 


f(pulse mode) = f(timer clock) / period register
f(clock mode) = f(timer clock) / (2 x period register)


This register is cleared to 0 at reset. 
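
As a worked example of the two equations above, the following Python sketch computes the timer signaling frequency from the timer input clock and the period register. The function name is hypothetical, and the clock values in the usage note are example numbers chosen for easy arithmetic. A period register of zero is a boundary condition that the equations do not cover, so the sketch rejects it.

```python
def timer_output_frequency(f_timer_clock_hz, period, clock_mode):
    """Return the timer signaling frequency in Hz.

    clock_mode=False models pulse mode (C/P = 0);
    clock_mode=True models clock mode (C/P = 1).
    """
    if period <= 0:
        raise ValueError("period = 0 is a boundary condition, not handled here")
    divisor = 2 * period if clock_mode else period
    return f_timer_clock_hz / divisor
```

With an example f(H1) of 20 MHz, the internal timer clock is 10 MHz. A period register of 1000 then gives 10 kHz in pulse mode and 5 kHz in clock mode.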


13.3.3 Timer Counter Register 


The timer counter register is located at 100024h for timer 0 and at 100034h for
timer 1.


The 32-bit timer counter register increments with each cycle of the timer input 
clock. The timer counter can be incremented on the rising edge (INV = 0) or 
on the falling edge (INV = 1) of an externally generated timer input clock 
(CLKSRC = 0). With an internally generated timer input clock (CLKSRC = 1), 
the timer counter increments on the rising edge only. The timer counter is 
zeroed whenever its value equals that of the period register. 


This register is cleared to 0 at reset. 


13.3.4 Boundary Conditions in the Control Registers 
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Certain boundary conditions, such as a zero in the period register and an over- 
flow of the counter, affect timer operation. These conditions are listed as 
follows: 


- When the period and counter registers are zero, the operation of the timer
  depends on the C/P mode selected. In pulse mode (C/P = 0), TSTAT is set
  and remains set. In clock mode (C/P = 1), the width of a cycle is 2/f(H1),
  and external clocks are ignored.

- When the counter register is not 0 and the period register = 0, the counter
  will count until it reaches its maximum 32-bit value (0FFFF FFFFh), roll
  over to 0, and then function as described in the preceding bullet.

- When the counter register is set to a value greater than the value of the
  period register, the counter reaches its maximum 32-bit value
  (0FFFF FFFFh), rolls over to 0, and continues.
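
The rollover behavior in the list above can be expressed as a small counter model. The Python sketch below is an illustration under stated assumptions: it models only the counter value (not TSTAT or the pulse generator), and the function names are hypothetical.

```python
MAX32 = 0xFFFFFFFF


def step(counter, period):
    """Advance one timer input clock: zero on a match, else increment mod 2**32."""
    return 0 if counter == period else (counter + 1) & MAX32


def clocks_until_match(counter, period):
    """Timer input clocks needed before the counter equals the period register."""
    if not (0 <= counter <= MAX32 and 0 <= period <= MAX32):
        raise ValueError("values must fit in 32 bits")
    if counter <= period:
        return period - counter
    # Counter starts above the period: count up to 0FFFF FFFFh,
    # roll over to 0, then continue up to the period value.
    return (MAX32 - counter) + 1 + period
```

Starting the counter just above the period therefore delays the first match by almost 2^32 input clocks, which is worth checking before loading the counter register by hand.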


Note:

Writes from the peripheral bus override register updates from the counter
and new status updates to the control register.
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13.4 Timer Pulse Generation 


The timer pulse generator (see Figure 13-1) can generate several different
TSTAT signals. These signals can be inverted (set by the INV bit) into the timer
output signal. The two basic pulse generation modes are pulse mode and
clock mode, as shown in Figure 13-4. You can select the mode with the C/P
bit of the timer global control register. In both modes, an internal clock source
has a frequency of f(H1)/2, and an external clock source has a maximum
frequency of f(H1)/2.6. In pulse mode (C/P = 0), the width of the pulse is 1/f(H1).
In clock mode (C/P = 1), the width of the pulse is the period register divided by
the frequency of the timer input clock.


Figure 13-4. Timer Pulse Mode and Clock Mode Timing

[Timing diagrams not reproduced. Surviving interval labels: 1/f(CLKSRC),
period register / f(CLKSRC), and (clock mode only) 2 x period register /
f(CLKSRC). Each counter = period event is marked; TINT is marked at TSTAT
0-to-1 transitions.]

(a) TSTAT and timer output (INV = 0) when C/P = 0 (pulse mode)

(b) TSTAT and timer output (INV = 0) when C/P = 1 (clock mode)

Note: TINT is the timer interrupt signal generated whenever TSTAT transitions
      from 0 to 1.
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The rate of the timer output (TSTAT) is determined by the frequency of the tim- 
er input clock and the period register. The following equations are valid with 
either an internal or an external timer clock: 


In pulse mode: f(TSTAT) = f(timer input clock) / period register
In clock mode: f(TSTAT) = f(timer input clock) / (2 x period register)

If the period register equals zero, refer to subsection 13.3.4, Boundary
Conditions in the Control Registers.

Figure 13-5 provides some examples of TSTAT and timer output (INV = 0)
when the period register is set to various values and clock or pulse mode is
selected. The timer input clock is generated internally (f(H1)/2).


Figure 13-5. Timer Output Generation Examples

[Waveforms not reproduced. Panels and surviving interval labels:]

(a) Pulse mode with timer period = 1 or
    clock mode with timer period = 0: 2H1
(b) Pulse mode with timer period = 2: 4H1
(c) Pulse mode with timer period = 3: 6H1
(d) Clock mode with timer period = 1: 2H1 and 4H1
(e) Clock mode with timer period = 2: 4H1 and 8H1
(f) Clock mode with timer period = 3
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13.5 Timer Interrupts 


Each timer can send an interrupt to the CPU when the TSTAT signal transitions
from 0 to 1. Timer 0 sends TINT0 and timer 1 sends TINT1.


13.5.1 Timer Interrupts and Their Vectors 


TINT0 corresponds to timer 0. This interrupt uses the interrupt vector at IVTP
+ 002h. It has a priority level of two, which is second only to NMI and RESET.

TINT1 corresponds to timer 1. This interrupt uses the interrupt vector at IVTP
+ 02Bh. It has the lowest priority level of all interrupts.

13.5.2 Timer Interrupt Operation 


A timer interrupt is generated whenever TSTAT transitions from a zero to a 
one. The frequency of timer interrupts depends on whether the timer is set up 
in pulse mode or clock mode. 


In pulse mode, the interrupt frequency is:
f(interrupt) = f(input timer clock) / period register

In clock mode, the interrupt frequency is:
f(interrupt) = f(input timer clock) / (2 x period register)


If the period register equals zero, see subsection 13.3.4, Boundary Conditions 
in the Control Registers, on page 13-8, for more information. 


The timer interrupt can be used to interrupt either the CPU or the DMA copro- 
cessor. 


The timer interrupt enable bits for the CPU are found in the IIE register. Bit 0
in the IIE corresponds to TINT0, and bit 1 corresponds to TINT1. For more
information about the IIE register, see subsection 3.1.9, CPU Internal Interrupt
Enable Register (IIE), on page 3-11.


The timer interrupt enable bits for the DMA control register are found in the DIE 
register. Several bits in this register control how each DMA channel responds 
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to the timers. For more information about the DIE register, see subsection 
13.3.4, Boundary Conditions in the Control Registers. 


13.5.3 Considerations When Using a Timer Interrupt 


The main consideration when using a timer to interrupt the CPU is the priority 
needed for the operation. If the timer operation has a low priority compared to 
other devices, then use timer 1, since that timer’s interrupt has the lowest prior- 
ity of all interrupts. If, on the other hand, the timer operation has a high priority 
compared to other devices, then use timer 0, since that timer’s interrupt is se- 
cond in priority only to an NMI. 
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13.6 Selecting CLKSRC and FUNC Values 


The timer can receive its input and send its output in several different modes, 
depending on the setting of CLKSRC, FUNC, and I/O. The four timer modes 
of operation are defined by the values of CLKSRC and FUNC in the global con- 
trol register. 


13.6.1 CLKSRC = 1 and FUNC = 0. 


If CLKSRC = 1 and FUNC = 0 (see Figure 13-6), the timer input comes from
the internal clock. Interrupts can still be generated on the transition of
TSTAT from 0 to 1. The internal clock is not affected by the INV bit in the global
control register. In this mode, TCLK is connected to the I/O port control and can
be used as a general-purpose I/O pin. If I/O = 0, TCLK is configured as a
general-purpose input pin whose state can be read in DATIN. DATOUT has no effect
on TCLK or DATIN. If I/O = 1, TCLK is configured as a general-purpose output
pin. DATOUT is placed on TCLK and can be read in DATIN.


Figure 13-6. Timer Configuration With CLKSRC=1 and FUNC=0 
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13.6.2 CLKSRC=1 and FUNC=1. 


If CLKSRC = 1 and FUNC = 1 (see Figure 13-7), the timer input comes from 
the internal clock, and the timer output goes to TCLK. You can invert the value 
on TCLK by setting INV to 1. Also, the value of TCLK can be read in DATIN. 


Figure 13-7. Timer Configuration With CLKSRC = 1 and FUNC = 1 
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13.6.3 CLKSRC = 0 and FUNC = 0 


If CLKSRC = 0 and FUNC = 0 (see Figure 13-8), the timer can still generate 
interrupt signals and is driven according to the status of the I/O bit: 


- If I/O = 0, the timer input comes from TCLK. You can invert the value read
  from TCLK by setting INV to 1, and the value of TCLK can be read through
  DATIN.

- If I/O = 1, TCLK is an output pin; both TCLK and the timer are driven by
  DATOUT. All 0-to-1 transitions of DATOUT increment the counter. INV has
  no effect on DATOUT. The value of DATOUT can be read through DATIN.


Figure 13-8. Timer Configuration With CLKSRC = 0 and FUNC = 0 
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13.6.4 CLKSRC = 0 and FUNC = 1 


If CLKSRC = 0 and FUNC = 1 (see Figure 13-9), TCLK drives the timer. If INV
= 0, all 0-to-1 transitions of TCLK increment the counter. If INV = 1, all 1-to-0
transitions of TCLK increment the counter. The value of TCLK can be read
through DATIN.


Figure 13-9. Timer Configuration With CLKSRC = 0 and FUNC = 1 
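
The four CLKSRC/FUNC combinations described in Section 13.6 can be summarized as a lookup. The Python sketch below restates only what subsections 13.6.1 through 13.6.4 say; the function name and the wording of the returned strings are hypothetical.

```python
def timer_mode(clksrc, func, io=0):
    """Return (timer input source, role of the TCLK pin) for a bit setting."""
    for bit in (clksrc, func, io):
        if bit not in (0, 1):
            raise ValueError("CLKSRC, FUNC, and I/O must each be 0 or 1")
    if clksrc == 1 and func == 0:
        # Internal clock; TCLK is a general-purpose I/O pin.
        role = "general-purpose output" if io else "general-purpose input"
        return ("internal clock", role)
    if clksrc == 1 and func == 1:
        # Internal clock; the timer output goes to TCLK.
        return ("internal clock", "timer output")
    if func == 0:
        # CLKSRC = 0, FUNC = 0: the source depends on the I/O bit.
        if io:
            return ("DATOUT", "output driven by DATOUT")
        return ("TCLK", "timer input")
    # CLKSRC = 0, FUNC = 1: TCLK drives the timer.
    return ("TCLK", "timer input")
```

For example, `timer_mode(1, 1)` corresponds to the configuration used when the timer generates a clock on the TCLK pin.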
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13.7 Using TCLKx as General-Purpose I/O Pins 


When FUNC = 0, TCLKx can be used as an I/O pin. Figure 13-10 and 
Figure 13-11 show how the TCLKx is connected when it is configured as a 
general-purpose I/O pin. In Figure 13-10, the I/O bit equals 0 and TCLK is con- 
figured as an input pin whose value can be read in the DATIN bit. In 
Figure 13-11, the I/O bit equals 1 and TCLK is configured as an output pin that 
outputs the value you wrote in the DATOUT bit. 


Figure 13-10. TCLK as an Input (I/O = 0) 
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13.8 Configuring a Timer 


Configuring a timer requires three basic steps: 


1) Halt the timer by clearing to 0 the GO and HLD bits of the timer
   global-control register. To do this, write a 0 to the timer global-control
   register. Note that the timers are halted on RESET.

2) Configure the timer via the timer global-control register (with
   GO = HLD = 0), the timer counter register, and the timer period register,
   if necessary.

3) Start the timer by setting the GO and HLD bits of the timer global-control
   register to 1.


Example 13-1 shows how to set up the ’C4x timer to generate the maximum 
frequency clock through the TCLKx pin. 


Example 13-1. Maximum Frequency Timer Clock Setup


*  TITLE  MAXIMUM FREQUENCY TIMER CLOCK SETUP
*
*  THIS EXAMPLE SHOWS HOW TO SET UP TIMER TO GENERATE MAXIMUM
*  FREQUENCY TIMER CLOCK USING INTERNAL CLOCK. WHERE
*  "TIMR_REGISTER" SECTION IS LOCATED FROM 100020h.
*
TIM0_CTL_REG  .usect  "TIMR_REGISTER",4   ; timer 0 global-control register
TIM0_CNT_REG  .usect  "TIMR_REGISTER",4   ; timer 0 counter register
TIM0_PRD_REG  .usect  "TIMR_REGISTER",8   ; timer 0 period register
*
              .text
              LDI     0,R0
              STI     R0,@TIM0_PRD_REG    ; period register = 0
              LDI     3C1H,R0
              STI     R0,@TIM0_CTL_REG    ; configure and start timer 0
              .end
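

The three configuration steps and the 3C1h control word can be checked with a
small Python sketch. The bit positions in FIELD_BIT are an assumption: they are
not given in this section and should be verified against the timer
global-control register description before use. The helper name control_word is
hypothetical.

```python
# Bit positions of the timer global-control register fields.
# ASSUMPTION: these positions are not stated in this section; verify them
# against the register figure before relying on them.
FIELD_BIT = {
    "FUNC": 0, "IO": 1, "DATOUT": 2, "DATIN": 3,
    "GO": 6, "HLD": 7, "CP": 8, "CLKSRC": 9, "INV": 10, "TSTAT": 11,
}


def control_word(*set_fields):
    """Build a control word with the named fields set to 1, all others 0."""
    word = 0
    for name in set_fields:
        word |= 1 << FIELD_BIT[name]
    return word


halt = control_word()                             # step 1: GO = HLD = 0
configure = control_word("FUNC", "CP", "CLKSRC")  # step 2: timer still halted
start = configure | control_word("GO", "HLD")     # step 3: set GO and HLD
print(hex(halt), hex(configure), hex(start))
```

Under these assumed bit positions, the final word equals the 3C1h value that
the example stores, which performs the configuration and the start in a single
write.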
The ’C4x assembly language instruction set supports numeric-intensive,
signal processing, and general-purpose applications. The instructions are 
organized into these major groups: load-and-store, two- or three-operand 
arithmetic/logical, parallel, program control, and interlocked operations 
instructions. The addressing modes used with the instructions are described 
in Chapter 6. 


The ’C4x instruction set can also use one of 20 condition codes with any of the 
10 conditional instructions, such as LDFcond. This chapter defines the 
condition codes and flags. 


The assembler allows optional syntax forms to simplify the assembly language 
for special-case instructions. These optional forms are listed and explained. 


Each of the individual instructions is described and listed in alphabetical order. 
An example instruction (on pages 14-23 through 14-25) demonstrates the 
special format used and explains its content. 


This chapter discusses these topics: 


Topic                                                    Page

14.1  Instruction Set .................................. 14-2
14.2  Condition Codes and Flags ........................ 14-12
14.3  Individual Instruction Descriptions .............. 14-16


14.1 Instruction Set


The ’C4x instruction set is exceptionally well-suited to digital signal processing 
and other numeric-intensive applications. All instructions are a single machine 
word long, and most instructions take a single cycle to execute. In addition to 
multiply and accumulate instructions, the ’C4x possesses a full complement 
of general-purpose instructions. 


The instruction set contains 145 instructions organized into the following
functional groups:


- Load-and-store
- Two-operand arithmetic/logical
- Three-operand arithmetic/logical
- Program control
- Interlocked operations
- Parallel operations


Each of these groups is discussed in the succeeding subsections. 


14.1.1 Load-and-Store Instructions 


The ’C4x supports 24 load-and-store instructions (see Table 14-1). These
instructions can:


- Load a word from memory into a register
- Store a word from a register into memory
- Manipulate data on the system stack
- Transfer data between primary register and expansion register


Two of these instructions can load data conditionally. This is useful for locating
the maximum or minimum value in a data set. See Section 14.2 for detailed
information on condition codes.


Table 14-1. Load-and-Store Instructions


Instruction  Description                                   Instruction  Description

LBb†         Load byte (signed)                            LDPK†        Load DP register immediate
LBUb†        Load byte (unsigned)                          LHw†         Load half-word signed
LDA†         Load address register                         LHUw†        Load half-word unsigned
LDE          Load floating-point exponent                  LWLct†       Load word left-shifted
LDEP†        Load integer, expansion-file register to      LWRct†       Load word right-shifted
             primary register
LDF          Load floating-point value                     POP          Pop integer from stack
LDFcond      Load floating-point value conditionally       POPF         Pop floating-point value from stack
LDHI†        Load 16-bit unsigned immediate into 16 MSBs   PUSH         Push integer on stack
LDI          Load integer                                  PUSHF        Push floating-point value on stack
LDIcond      Load integer conditionally                    STF          Store floating-point value
LDM          Load floating-point mantissa                  STI          Store integer
LDPE†        Load integer, primary register to             STIK†        Store integer immediate
             expansion-file register


† The ’C4x instruction set is a superset of the ’C3x instruction set. The instructions marked are ’C4x-specific.


14.1.2 Two-Operand Instructions


The ’C4x supports a complete set of 43 two-operand arithmetic and logical
instructions. The two operands are the source and destination. The source
operand can be a memory word, a register, or a constant. The destination
operand is always a register.


These instructions provide integer, floating-point, or logical operations,
and multiprecision arithmetic. Table 14-2 lists these instructions.


Table 14-2. Two-Operand Instructions


Instruction  Description

ABSF         Absolute value of a floating-point number
ABSI         Absolute value of an integer
ADDC†        Add integers with carry
ADDF†        Add floating-point values
ADDI†        Add integers
AND†         Bitwise logical-AND
ANDN†        Bitwise logical-AND with complement
ASH†         Arithmetic shift
CMPF†        Compare floating-point values
CMPI†        Compare integers
FIX          Convert floating-point value to integer
FLOAT        Convert integer to floating-point value
FRIEEE‡      Convert IEEE floating-point format to 2s-complement floating-point format
LSH†         Logical shift
MBct‡        Merge byte, left shifted
MHct‡        Merge half-word, left shifted
MPYF†        Multiply floating-point values
MPYI†        Multiply integers
MPYSHI†‡     Multiply signed integer, 32-MSB product
MPYUHI†‡     Multiply unsigned integer, 32-MSB product
NEGB         Negate integer with borrow
NEGF         Negate floating-point value
NEGI         Negate integer
NORM         Normalize floating-point value
NOT          Bitwise logical-complement
OR†          Bitwise logical-OR
RCPF‡        Reciprocal floating point
RND          Round floating-point value
ROL          Rotate left
ROLC         Rotate left through carry
ROR          Rotate right
RORC         Rotate right through carry
RSQRF‡       Reciprocal of square root, floating-point value
SUBB†        Subtract integers with borrow
SUBC         Subtract integers conditionally
SUBF†        Subtract floating-point values
SUBI†        Subtract integer
SUBRB        Subtract reverse integer with borrow
SUBRF        Subtract reverse floating-point value
SUBRI        Subtract reverse integer
TOIEEE‡      Convert 2s complement to IEEE format
TSTB†        Test bit fields
XOR†         Bitwise exclusive-OR


† Two- and three-operand versions.
‡ The ’C4x instruction set is a superset of the ’C3x instruction set. The instructions marked are ’C4x-specific.


14.1.3 Three-Operand Instructions


Most instructions contain two or three operands. The 19 three-operand
instructions allow the ’C4x to read two operands from memory or the CPU
register file in a single cycle and store the results in a register. The following
differentiates the two- and three-operand instructions:


- Two-operand instructions have one source operand (or shift count) and a
  destination operand.


- Three-operand instructions may have two source operands (or one
  source operand and a count operand) and a destination operand. A
  source operand can be a memory word, a register, or a constant. The
  destination of a three-operand instruction is always a register.


Table 14-3 lists the instructions that have three-operand versions. Note that
the 3 in the mnemonic can be omitted from three-operand instructions
(see subsection 14.3.2).


Table 14-3. Three-Operand Instructions


Instruction  Description                           Instruction  Description

ADDC3        Add with carry                        MPYI3        Multiply integers
ADDF3        Add floating-point values             MPYSHI3†     Multiply signed integer, 32-MSB product
ADDI3        Add integers                          MPYUHI3†     Multiply unsigned integer, 32-MSB product
AND3         Bitwise logical-AND                   OR3          Bitwise logical-OR
ANDN3        Bitwise logical-AND with complement   SUBB3        Subtract integers with borrow
ASH3         Arithmetic shift                      SUBF3        Subtract floating-point values
CMPF3        Compare floating-point values         SUBI3        Subtract integers
CMPI3        Compare integers                      TSTB3        Test bit fields
LSH3         Logical shift                         XOR3         Bitwise exclusive-OR
MPYF3        Multiply floating-point values


† The ’C4x instruction set is a superset of the ’C3x instruction set. The instructions marked are ’C4x-specific.


14.1.4 Program Control Instructions


The program-control instruction group consists of all of those instructions (24)
that affect program flow. The repeat mode allows repetition of a block of code
(RPTB and RPTBD) or of a single line of code (RPTS). Both standard and
delayed (single-cycle) branching are supported. Several of the program control
instructions are capable of conditional operations (see Section 14.2 for
detailed information on condition codes). Table 14-4 lists the program control
instructions.


Table 14-4. Program Control Instructions


Instruction  Description                                       Instruction  Description

Bcond        Branch conditionally (standard)                   LAJ†         Link and jump
BcondAF†     Branch conditionally delayed and annul if false   LAJcond†     Link and jump conditional
BcondAT†     Branch conditionally delayed and annul if true    LATcond†     Link and trap conditional
BcondD       Branch conditionally (delayed)                    NOP          No operation
BR‡          Branch unconditionally (standard)                 RETIcond     Return from interrupt conditionally
BRD‡         Branch unconditionally (delayed)                  RETIcondD†   Return from trap or interrupt, delayed
CALL‡        Call subroutine                                   RETScond     Return from subroutine conditionally
CALLcond     Call subroutine conditionally                     RPTB‡        Repeat block of instructions
DBcond       Decrement and branch conditionally (standard)     RPTBD        Repeat block, delayed
DBcondD      Decrement and branch conditionally (delayed)      RPTS         Repeat single instruction
IACK         Interrupt acknowledge                             SWI          Software interrupt
IDLE         Idle until interrupt                              TRAPcond     Trap conditionally


† The ’C4x instruction set is a superset of the ’C3x instruction set. The instructions marked are ’C4x-specific.
‡ Operand addressing mode is incompatible with ’C3x.


14.1.5 Interlocked Operations Instructions


The interlocked operations instructions support multiprocessor communication
and the use of external signals to allow for powerful synchronization
mechanisms. They also guarantee the integrity of the communication and result in
a high-speed operation. Refer to Chapter 7 for examples of the use of
interlocked instructions.


Table 14-5. Interlocked Operations Instructions


Instruction  Description                              Instruction  Description

LDFI         Load floating-point value, interlocked   STFI         Store floating-point value, interlocked
LDII         Load integer, interlocked                STII         Store integer, interlocked
SIGI         Signal, interlocked


14.1.6 Parallel Operations Instructions


The parallel-operations instructions group makes a high degree of parallelism
possible. Some of the ’C4x instructions can occur in pairs that are executed
in parallel. These instructions offer the following features:


- Parallel loading of registers
- Parallel store
- Parallel arithmetic operations
- Arithmetic/logical instructions used in parallel with a store instruction


Each instruction in a pair is entered as a separate source statement. The
second instruction in the pair must be preceded by two vertical bars (||).
Table 14-6 lists the valid instruction pairs.


Table 14-6. Parallel Instructions


(a) Parallel Arithmetic With Store Instructions


Mnemonic          Description

ABSF || STF       Absolute value of a floating-point number and store floating-point value
ABSI || STI       Absolute value of an integer and store integer
ADDF3 || STF      Add floating-point values and store floating-point value
ADDI3 || STI      Add integers and store integer
AND3 || STI       Bitwise-logical AND and store integer
ASH3 || STI       Arithmetic shift and store integer
FIX || STI        Convert floating-point to integer and store integer
FLOAT || STF      Convert integer to floating-point value and store floating-point value
FRIEEE || STF†    Convert IEEE floating-point format and store
LDF || STF        Load floating-point value and store floating-point value
LDI || STI        Load integer and store integer
LSH3 || STI       Logical shift and store integer
MPYF3 || STF      Multiply floating-point values and store floating-point value
MPYI3 || STI      Multiply integer and store integer
NEGF || STF       Negate floating-point value and store floating-point value
NEGI || STI       Negate integer and store integer
NOT || STI        Complement value and store integer
OR3 || STI        Bitwise-logical OR value and store integer
STF || STF        Store floating-point values
STI || STI        Store integers
SUBF3 || STF      Subtract floating-point value and store floating-point value
SUBI3 || STI      Subtract integer and store integer
TOIEEE || STF†    Convert to IEEE format and store
XOR3 || STI       Bitwise-exclusive OR values and store integer


(b) Parallel Load Instructions


Mnemonic          Description

LDF || LDF        Load floating-point
LDI || LDI        Load integer


(c) Parallel Multiply and Add/Subtract Instructions


Mnemonic          Description

MPYF3 || ADDF3    Multiply and add floating-point
MPYF3 || SUBF3    Multiply and subtract floating-point
MPYI3 || ADDI3    Multiply and add integer
MPYI3 || SUBI3    Multiply and subtract integer


† The ’C4x instruction set is a superset of the ’C3x instruction set. The instructions marked are ’C4x-specific.


14.1.7 Illegal Instructions


The ’C4x has no illegal instruction detection mechanism. Fetching an illegal
(undefined) code may result in the execution of an undefined operation. If TI
TMS320 floating-point software tools are used, no illegal opcodes can be
generated. An illegal opcode can only be generated by the misuse of the tools, by
an error in the ROM code, or by a defective RAM.


14.2 Condition Codes and Flags


The ’C4x provides 20 condition codes (00000 through 10100, excluding 01011) that
can be used with any of the conditional instructions, such as RETScond or
LDFcond. The conditions include signed and unsigned comparisons,
comparisons to zero, and comparisons based on the status of individual condition
flags. Note that all conditional instructions can also accept the suffix U to
indicate unconditional operation.


Seven condition flags provide information about properties of the result of
arithmetic and logical instructions. The condition flags are stored in the status
register (ST); the effect of an instruction on a condition flag depends on the
value of the SET COND field (bit 15 of the status register). The value of SET
COND (0 or 1) does not affect the nature of the compare instructions (CMPF,
CMPF3, CMPI, CMPI3, TSTB, or TSTB3).


- If SET COND = 0, the ST condition flags are set if the operation’s target
  is any extended-precision register (R0-R11).


- If SET COND = 1, the ST condition flags are also set if the operation’s
  target is any register in the primary register file except the status register.


The condition flags can be modified by most instructions when either of the
preceding conditions is established and either of the following two cases
occurs:


- A result is generated when the specified operation is performed to infinite
  precision. This is appropriate for compare-and-test instructions that do not
  store results in a register. It is also appropriate for arithmetic instructions
  that produce underflow or overflow.


- The output is written to the destination register as shown in Table 14-7.
  This is appropriate for other instructions that modify the condition flags.


Table 14-7. Output Value Formats


Type of Operation  Output Format

Floating-point     8-bit exponent, 1 sign bit, 31-bit fraction
Integer            32-bit integer
Logical            32-bit unsigned integer
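

The SET COND rule above can be restated as a small Python predicate. This is a
sketch under stated assumptions: the name flags_may_change is hypothetical, the
membership of the primary register file is taken from the CPU register symbols
in Table 14-10, and compare instructions (CMPF, CMPI, TSTB, and their
three-operand forms), which are not governed by SET COND, are outside its scope.

```python
# Extended-precision registers R0-R11.
EXTENDED_PRECISION = {f"R{n}" for n in range(12)}

# ASSUMPTION: primary register file membership as listed in Table 14-10
# (IVTP and TVTP belong to the expansion file and are therefore excluded).
PRIMARY_FILE = EXTENDED_PRECISION | {f"AR{n}" for n in range(8)} | {
    "DP", "IR0", "IR1", "BK", "SP", "ST", "DIE", "IIE", "IIF", "RS", "RE", "RC",
}


def flags_may_change(set_cond, dst):
    """True if an instruction targeting dst is allowed to update the flags."""
    if dst in EXTENDED_PRECISION:
        return True  # applies for SET COND = 0 and SET COND = 1
    # With SET COND = 1, any other primary register except ST also qualifies.
    return set_cond == 1 and dst in PRIMARY_FILE and dst != "ST"


print(flags_may_change(0, "AR0"), flags_may_change(1, "AR0"))
```

The predicate only answers whether the destination permits a flag update;
which flags actually change still depends on the individual instruction.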




Figure 14-1 shows the condition flags in the low-order bits of the status
register. Following the figure is a list of status register condition flags and
descriptions of how the flags are set by most instructions. For specific details of the
effect of a particular instruction on the condition flags, see the description of
that instruction in subsection 14.3.3.


Figure 14-1. Status Register


Bits 31-16:  xx (R)

Bit    15        14        13   12  11  10  9    8   7    6    5   4   3  2  1  0
Field  SET COND  ANALYSIS  GIE  CC  CE  CF  PCF  RM  OVM  LUF  LV  UF  N  Z  V  C


NOTE: xx = reserved bit.
      R = read, W = write.


LUF   Latched Underflow Condition Flag. LUF is set whenever UF
      (floating-point underflow flag) is set. LUF can be cleared only by a
      processor reset or by modifying it in the status register (ST).


LV    Latched Overflow Condition Flag. LV is set whenever V (overflow
      condition flag) is set. Otherwise, it is unchanged. LV can be cleared
      only by a processor reset or by modifying it in the status register (ST).


UF    Floating-Point Underflow Condition Flag. A floating-point underflow
      occurs whenever the exponent of the result is less than or equal
      to -128. If a floating-point underflow occurs, UF is set, and the output
      value is set to 0. UF is cleared if a floating-point underflow does not
      occur.


N     Negative Condition Flag. Logical operations assign N the state of
      the MSB of the output value. For integer and floating-point
      operations, N is set if the result is negative, and cleared otherwise. Zero is
      positive.


Z     Zero Condition Flag. For logical, integer, and floating-point
      operations, Z is set if the output is 0, and cleared otherwise.


V     Overflow Condition Flag. For integer operations, V is set if the result
      does not fit into the format specified for the destination (that is, if
      the result lies outside the range $-2^{31} \le \text{result} \le 2^{31} - 1$).
      Otherwise, V is cleared. For floating-point operations, V is set if the
      exponent of the result is greater than 127; otherwise, V is cleared.
      Logical operations always clear V.


C     Carry Flag. When an integer addition is performed, C is set if a carry
      occurs out of the bit corresponding to the MSB of the output. When an
      integer subtraction is performed, C is set if a borrow occurs into the bit
      corresponding to the MSB of the output. Otherwise, for integer
      operations, C is cleared. The carry flag is unaffected by floating-point and
      logical operations. For shift instructions, this flag is set to the final
      value shifted out; for a zero shift count, this is set to zero.


Table 14-8 lists the condition mnemonic, code, description, and flag for each
of the 20 condition codes.


Table 14-8. Condition Codes and Flags


(a) Unconditional Compares


Condition  Code   Description                          Flag†

U          00000  Unconditional                        Don’t care


(b) Unsigned Compares


Condition  Code   Description                          Flag†

LO         00001  Lower than                           C
LS         00010  Lower than or same as                C OR Z
HI         00011  Higher than                          ~C AND ~Z
HS         00100  Higher than or same as               ~C
EQ         00101  Equal to                             Z
NE         00110  Not equal to                         ~Z


(c) Signed Compares


Condition  Code   Description                          Flag†

LT         00111  Less than                            N
LE         01000  Less than or equal to                N OR Z
GT         01001  Greater than                         ~N AND ~Z
GE         01010  Greater than or equal to             ~N
EQ         00101  Equal to                             Z
NE         00110  Not equal to                         ~Z


(d) Compare to Zero


Condition  Code   Description                          Flag†

Z          00101  Zero                                 Z
NZ         00110  Not zero                             ~Z
P          01001  Positive                             ~N AND ~Z
N          00111  Negative                             N
NN         01010  Nonnegative                          ~N


(e) Compare to Condition Flags


Condition  Code   Description                          Flag†

NN         01010  Nonnegative                          ~N
N          00111  Negative                             N
NZ         00110  Nonzero                              ~Z
Z          00101  Zero                                 Z
NV         01100  No overflow                          ~V
V          01101  Overflow                             V
NUF        01110  No underflow                         ~UF
UF         01111  Underflow                            UF
NC         00100  No carry                             ~C
C          00001  Carry                                C
NLV        10000  No latched overflow                  ~LV
LV         10001  Latched overflow                     LV
NLUF       10010  No latched floating-point underflow  ~LUF
LUF        10011  Latched floating-point underflow     LUF
ZUF        10100  Zero or floating-point underflow     Z OR UF


† The ~ means logical complement (“not true” condition).
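

The flag expressions in Table 14-8 translate directly into predicates. The
following Python sketch encodes the table as data and evaluates a condition
against a set of flag bits; the names CONDITIONS, ALIASES, and condition_holds
are hypothetical, and the code is a reading aid for the table, not a model of
how the processor decodes the condition field.

```python
# mnemonic: (5-bit code, predicate over a dict of flag bits)
CONDITIONS = {
    "U":    (0b00000, lambda f: True),
    "LO":   (0b00001, lambda f: f["C"]),
    "LS":   (0b00010, lambda f: f["C"] or f["Z"]),
    "HI":   (0b00011, lambda f: not f["C"] and not f["Z"]),
    "HS":   (0b00100, lambda f: not f["C"]),
    "EQ":   (0b00101, lambda f: f["Z"]),
    "NE":   (0b00110, lambda f: not f["Z"]),
    "LT":   (0b00111, lambda f: f["N"]),
    "LE":   (0b01000, lambda f: f["N"] or f["Z"]),
    "GT":   (0b01001, lambda f: not f["N"] and not f["Z"]),
    "GE":   (0b01010, lambda f: not f["N"]),
    "NV":   (0b01100, lambda f: not f["V"]),
    "V":    (0b01101, lambda f: f["V"]),
    "NUF":  (0b01110, lambda f: not f["UF"]),
    "UF":   (0b01111, lambda f: f["UF"]),
    "NLV":  (0b10000, lambda f: not f["LV"]),
    "LV":   (0b10001, lambda f: f["LV"]),
    "NLUF": (0b10010, lambda f: not f["LUF"]),
    "LUF":  (0b10011, lambda f: f["LUF"]),
    "ZUF":  (0b10100, lambda f: f["Z"] or f["UF"]),
}

# Mnemonics that share a code with an entry above.
ALIASES = {"Z": "EQ", "NZ": "NE", "N": "LT", "P": "GT",
           "NN": "GE", "C": "LO", "NC": "HS"}


def condition_holds(mnemonic, flags):
    """Evaluate a condition mnemonic against a dict of flag bits (0 or 1)."""
    _code, predicate = CONDITIONS[ALIASES.get(mnemonic, mnemonic)]
    return bool(predicate(flags))


# Flags after a compare whose result was negative and nonzero, with no borrow.
flags = {"C": 0, "V": 0, "Z": 0, "N": 1, "UF": 0, "LV": 0, "LUF": 0}
print(condition_holds("LT", flags), condition_holds("HI", flags))
```

Storing the aliases separately makes it visible that the table defines 20
distinct codes even though it lists more than 20 mnemonics.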
14.3 Individual Instruction Descriptions


This section contains the individual assembly language instructions for the
’C4x. The instructions are listed in alphabetical order. Information for each
instruction includes assembler syntax, operation, operands, encoding,
description, cycles, status bits, mode bit, and examples.


Definitions of the symbols and abbreviations, as well as optional syntax forms
allowed by the assembler, precede the individual instruction description
section. Also, an example instruction shows the special format used and explains
its content.


You can find a functional grouping of the instructions, as well as a complete
instruction set summary, in Section 14.1. See Chapter 7, Addressing and Stack
Management, for information on memory addressing.


14.3.1 Symbols and Abbreviations


Table 14-9 lists the symbols and abbreviations used in the individual
instruction descriptions.


Table 14-9. Instruction Symbols


Symbol       Meaning

src          Source operand
src1         Source operand 1
src2         Source operand 2
src3         Source operand 3
src4         Source operand 4
dst          Destination operand
dst1         Destination operand 1
dst2         Destination operand 2
disp         Displacement
cond         Condition
count        Shift count
G            General addressing modes
T            Three-operand addressing modes
P            Parallel addressing modes
B            Conditional-branch addressing modes
ARn          Auxiliary register n
IRn          Index register n
Rn           Extended-precision register address n
RC           Repeat count register
RE           Repeat end address register
RS           Repeat start address register
ST           Status register
C            Carry bit of status register
GIE          Global interrupt enable bit of status register
N            Trap vector
PC           Program counter
RM           Repeat mode flag
SP           System stack pointer
|x|          Absolute value of x
x → y        Assign the value of x to destination y
x(man)       Mantissa field (sign + fraction) of x
x(exp)       Exponent field of x
op1 || op2   Operation 1 performed in parallel with operation 2
x AND y      Bitwise-logical AND of x and y
x OR y       Bitwise-logical OR of x and y
x XOR y      Bitwise-logical XOR of x and y
~x           Bitwise-logical complement of x
x << y       Shift x to the left y bits
x >> y       Shift x to the right y bits
*++SP        Increment SP and use incremented SP as address
*SP--        Use SP as address and decrement SP


14.3.2 Optional Assembler Syntaxes


The assembler allows a relaxed syntax form for some instructions. These
optional forms simplify the assembly language so that special-case syntax can
be ignored. The following is a list of these optional syntax forms.


- The destination register can be omitted on unary arithmetic and logical
  operations when the same register is used as a source. For example,

      ABSI R0,R0   can be written as   ABSI R0

  Instructions affected: ABSI, ABSF, FIX, FLOAT, NEGB, NEGF, NEGI,
  NORM, NOT, RND.


- All three-operand instructions can be written without the 3. For example,

      ADDI3 R0,R1,R2   can be written as   ADDI R0,R1,R2

  Instructions affected: ADDC3, ADDF3, ADDI3, AND3, ANDN3, ASH3,
  LSH3, MPYF3, MPYI3, OR3, SUBB3, SUBF3, SUBI3, XOR3,
  MPYSHI3, MPYUHI3.

  This also applies to all the pertinent parallel instructions.


- All three-operand comparison instructions can be written without the 3.
  For example,

      CMPI3 R0,*AR0   can be written as   CMPI R0,*AR0

  Instructions affected: CMPI3, CMPF3, TSTB3.


- Indirect operands with an explicit 0 displacement are allowed. In
  three-operand or parallel instructions, operands with 0 displacement are
  automatically converted to no-displacement mode. For example:

      LDI *+AR0(0),R1   is legal

  Also

      ADDI3 *+AR0(0),R1,R2   is equivalent to   ADDI3 *AR0,R1,R2


- Indirect operands can be written with no displacement, in which case a
  displacement of 1 is assumed. For example,

      LDI *AR0++(1),R0   can be written as   LDI *AR0++,R0


- All conditional instructions accept the suffix U to indicate unconditional
  operation. Also, the U can be omitted from unconditional short branch
  instructions. For example:

      BU label   can be written as   B label


- Labels can be written with or without a trailing colon. For example:

      label0:  NOP
      label1   NOP
      label2:  (label assembles to next source line)


- Empty expressions are not allowed for the displacement in indirect mode:

      LDI *+AR0(),R0   is not legal


- Immediate-mode destination operands of BR and CALL can be written
  with an at (@) sign:

      BR label   can be written as   BR @label


- The LDP pseudo-op can be used to load a register (DP by default) with the
  16 MSBs of a relocatable address as follows:

      LDP addr,REG   or   LDP @addr,REG   or   LDP addr

  The at (@) sign is optional.

  LDP generates an LDIU instruction. An immediate operand with a special
  relocation type is used.


- Parallel instructions can be written in either order. For example:

         ADDI
      || STI

  can be written as

         STI
      || ADDI


- The parallel bars indicating part two of a parallel instruction can be written
  anywhere on the line from column 0 to the mnemonic. For example:

            ADDI
      ||    STI

  can be written as

            ADDI
         || STI


- If the second operand of a parallel instruction is the same as the third
  (destination register) operand, the third operand can be omitted. This allows
  the writing of three-operand parallel instructions that look like normal
  two-operand instructions. For example,

         ADDI *AR0,R2,R2
      || MPYI *AR1,R0,R0

  can be written as

         ADDI *AR0,R2
      || MPYI *AR1,R0

  Instructions affected (applies to all parallel instructions that have a register
  as the second operand): ADDI, ADDF, AND, MPYI, MPYF, OR, SUBI,
  SUBF, XOR.


- All commutative operations in parallel instructions can be written in either
  order. For example, the ADDI part of a parallel instruction can be written
  in either of two ways:

      ADDI *AR0,R1,R2   or   ADDI R1,*AR0,R2

  The instructions affected are parallel instructions containing any of the
  following: ADDI, ADDF, MPYI, MPYF, AND, OR, XOR.


- Use the syntax in Table 14-10 to designate CPU registers in operands.
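

To make a few of these optional forms concrete, the following Python sketch
rewrites relaxed statements into their explicit forms. It is a hypothetical
helper (the name expand and both lookup sets are illustrative), it is not how
the TI assembler is implemented, and it deliberately covers only three rules:
the omitted destination on unary operations with a register source, the omitted
3 on non-compare three-operand instructions, and B as shorthand for BU.

```python
# Unary operations whose destination may be omitted.
UNARY = {"ABSI", "ABSF", "FIX", "FLOAT", "NEGB", "NEGF", "NEGI",
         "NORM", "NOT", "RND"}

# Base mnemonics of three-operand instructions (compare forms excluded,
# because their operand count alone does not identify the form).
THREE_OPERAND_BASE = {"ADDC", "ADDF", "ADDI", "AND", "ANDN", "ASH", "LSH",
                      "MPYF", "MPYI", "OR", "SUBB", "SUBF", "SUBI", "XOR",
                      "MPYSHI", "MPYUHI"}


def expand(statement):
    """Return the explicit form of a single, non-parallel statement."""
    parts = statement.split(None, 1)
    mnemonic = parts[0].upper()
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []

    if (mnemonic in UNARY and len(operands) == 1
            and not operands[0].startswith(("*", "@"))):
        # ABSI R0 -> ABSI R0,R0 (only when the source is a register)
        operands = operands * 2
    elif mnemonic in THREE_OPERAND_BASE and len(operands) == 3:
        # ADDI R0,R1,R2 -> ADDI3 R0,R1,R2
        mnemonic += "3"
    elif mnemonic == "B":
        # B label -> BU label
        mnemonic = "BU"
    return (mnemonic + " " + ",".join(operands)).strip()


for text in ("ABSI R0", "ADDI R0,R1,R2", "B label", "ADDI R0,R1"):
    print(text, "->", expand(text))
```

Two-operand statements such as ADDI R0,R1 pass through unchanged, which shows
why the operand count is enough to tell the two forms apart for these
instructions.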


14.3.3 Individual Instruction Descriptions


Each assembly language instruction for the ’C4x is described in this section
in alphabetical order. The description includes the assembler syntax,
operation, operands, encoding, description, cycles, status bits, mode bit, and
examples. Table 14-10 shows the CPU register symbols.


Table 14-10. CPU Register Symbols


          Register
Register  Machine
Symbol    Value (hex)  Assigned Function Name                Subsection  Page

R0        00           Extended-precision register 0         3.1.1       3-3
R1        01           Extended-precision register 1         3.1.1       3-3
R2        02           Extended-precision register 2         3.1.1       3-3
R3        03           Extended-precision register 3         3.1.1       3-3
R4        04           Extended-precision register 4         3.1.1       3-3
R5        05           Extended-precision register 5         3.1.1       3-3
R6        06           Extended-precision register 6         3.1.1       3-3
R7        07           Extended-precision register 7         3.1.1       3-3
R8        1C           Extended-precision register 8         3.1.1       3-3
R9        1D           Extended-precision register 9         3.1.1       3-3
R10       1E           Extended-precision register 10        3.1.1       3-3
R11       1F           Extended-precision register 11        3.1.1       3-3
AR0       08           Auxiliary register 0                  3.1.2       3-4
AR1       09           Auxiliary register 1                  3.1.2       3-4
AR2       0A           Auxiliary register 2                  3.1.2       3-4
AR3       0B           Auxiliary register 3                  3.1.2       3-4
AR4       0C           Auxiliary register 4                  3.1.2       3-4
AR5       0D           Auxiliary register 5                  3.1.2       3-4
AR6       0E           Auxiliary register 6                  3.1.2       3-4
AR7       0F           Auxiliary register 7                  3.1.2       3-4
DP        10           Data-page pointer                     3.1.3       3-4
IR0       11           Index register 0                      3.1.4       3-4
IR1       12           Index register 1                      3.1.4       3-4
BK        13           Block-size register                   3.1.5       3-5
SP        14           System stack pointer                  3.1.6       3-5
ST        15           Status register                       3.1.7       3-5
DIE       16           DMA coprocessor interrupt enable      3.1.8       3-8
IIE       17           Internal-interrupt enable register    3.1.9       3-11
IIF       18           IIOF pins and interrupt flag register 3.1.10      3-13
RS        19           Repeat start address                  3.1.11      3-16
RE        1A           Repeat end address                    3.1.11      3-16
RC        1B           Repeat counter                        3.1.11      3-16
IVTP      00           Interrupt-vector table pointer        3.2         3-17
TVTP      01           Trap-vector table pointer             3.2         3-17
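

For tooling that needs these values, Table 14-10 can be captured as a lookup
table. The Python sketch below is illustrative only: the names REGISTER_CODE,
EXPANSION_CODE, and register_code are hypothetical. IVTP and TVTP are kept in a
separate dictionary because their machine values (00 and 01) would otherwise
collide with R0 and R1.

```python
# Primary register file: symbol -> machine value, as listed in Table 14-10.
REGISTER_CODE = {f"R{n}": n for n in range(8)}                                 # R0-R7: 00-07
REGISTER_CODE.update({f"R{n}": 0x1C + (n - 8) for n in range(8, 12)})   # R8-R11: 1C-1F
REGISTER_CODE.update({f"AR{n}": 0x08 + n for n in range(8)})            # AR0-AR7: 08-0F
REGISTER_CODE.update({
    "DP": 0x10, "IR0": 0x11, "IR1": 0x12, "BK": 0x13, "SP": 0x14,
    "ST": 0x15, "DIE": 0x16, "IIE": 0x17, "IIF": 0x18,
    "RS": 0x19, "RE": 0x1A, "RC": 0x1B,
})

# Expansion-file registers use their own numbering.
EXPANSION_CODE = {"IVTP": 0x00, "TVTP": 0x01}


def register_code(symbol):
    """Return the machine value of a primary-file register symbol."""
    return REGISTER_CODE[symbol.upper()]


print(format(register_code("R8"), "02X"), format(register_code("ar7"), "02X"))
```

Note that R8 through R11 do not continue the numbering of R0 through R7; the
table places them at 1C through 1F.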


Syntax 


Operands 


Opcode 


Word Fields 


Example Instruction EXAMPLE 


INST src, dst 


or 


INST1 src2, dst? 
l| INST2 src3, dsi2 


Each instruction begins with an assembler syntax expression. Labels may be 
placed either before the command (instruction mnemonic) on the same line or 
on the preceding line in the first column. The optional comment field that con- 
cludes the syntax is not included in the syntax expression. A space is required 
between fields (label, command, operand, and comment fields). 


The syntax examples illustrate the common one-line syntax and the two-line 
syntax used in parallel addressing. Note that the two vertical bars || that indi- 
cate a parallel addressing pair can be placed anywhere before the mnemonic 
on the second line. The first instruction in the pair can have a label, but the sec- 
ond instruction cannot have a label. 


src general-addressing modes (G): 
dst register (RO — R11) 


The operands segment lists the types of operands that the instruction uses. 


31 24 23 1615 8 7 


oO 


3 1615 8 7 0 


1 24 23 
fe [weriTInsT] ds’ [ooo] wea | owe | oe — 


Encoding examples are shown for general addressing and parallel address- 
ing. The instruction pair for the parallel addressing example consists of INST1 
and INST2. Note that two separate opcodes are listed in this case; each 
instruction is 32-bits in length in the ’C4x. 


~~ G_srcaddressing modes 
"00 register(RO-R11) 
01 direct 
10 indirect 


11 immediate 
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Operation 


Description 


Status Bits 
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The word fields segment describes the addressing mode that corresponds to 
each value of a word field in the opcode. The word field listed in the table corre- 
sponds to the field listed under operands. 


|src | > dst 
or 


|src2 | > dst? 
|| src3— dst2 


The instruction operation sequence describes the processing that takes place 
when the instruction is executed. For parallel instructions, the operation se- 
quence is performed in parallel. Conditional effects of status register specified 
modes are listed for conditional instructions such as Bcond. 


dst    register (any register in CPU primary-register file)
or
src2   indirect (disp = 0, 1, IR0, IR1)
dst1   register (R0-R7)
src3   register (R0-R7)
dst2   indirect (disp = 0, 1, IR0, IR1)


Operands are defined according to the addressing mode and/or the type of ad- 
dressing used. Note that indirect addressing uses displacements and the in- 
dex registers. See Chapter 6, Addressing, for detailed information on addres- 
sing. 

Instruction execution and its effect on the rest of the processor or memory
contents are described in this segment. Any constraints on the operands imposed
by the processor or the assembler are discussed. The description parallels
and supplements the information given by the operation block.


LUF   Latched Floating-Point Underflow Condition Flag. 1 if a floating-point
      underflow occurs, unchanged otherwise.

LV    Latched Overflow Condition Flag. 1 if an integer or floating-point
      overflow occurs, unchanged otherwise.

UF    Floating-Point Underflow Condition Flag. 1 if a floating-point
      underflow occurs, 0 otherwise.

N     Negative Condition Flag. 1 if a negative result is generated, 0
      otherwise. In some instructions, this flag is the MSB of the output.

Z     Zero Condition Flag. 1 if a zero result is generated, 0 otherwise. For
      logical and shift instructions, 1 if a zero output is generated, 0
      otherwise.


Mode Bit 


Cycles 


Example 


Example Instruction EXAMPLE 


V     Overflow Condition Flag. 1 if an integer or floating-point overflow
      occurs, 0 otherwise.

C     Carry Flag. 1 if a carry or borrow occurs, 0 otherwise. For shift
      instructions, this flag is set to the value of the last bit shifted out;
      0 for a shift count of 0.


The seven condition flags are stored in the status register (ST). They provide 
information about the properties of the result or output of arithmetic or logical 
operations. 
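
The difference between the latched flags (LUF, LV) and their non-latched counterparts (UF, V) can be sketched in a few lines of Python. This is an illustrative model only: the Flags class and the update function are hypothetical names introduced here, and the model covers only an instruction that updates all four flags (many instructions leave some flags unaffected).

```python
from dataclasses import dataclass


@dataclass
class Flags:
    """Hypothetical model of four of the seven ST condition flags."""
    LUF: int = 0  # latched floating-point underflow
    LV: int = 0   # latched overflow
    UF: int = 0   # floating-point underflow
    V: int = 0    # overflow


def update(flags: Flags, overflow: bool, underflow: bool) -> Flags:
    """Apply the flag definitions above after one operation."""
    return Flags(
        LUF=1 if underflow else flags.LUF,  # 1 on underflow, else unchanged
        LV=1 if overflow else flags.LV,     # 1 on overflow, else unchanged
        UF=1 if underflow else 0,           # 1 on underflow, else 0
        V=1 if overflow else 0,             # 1 on overflow, else 0
    )


f = update(Flags(), overflow=True, underflow=False)   # this operation overflows
f = update(f, overflow=False, underflow=False)        # the next one does not
print(f)  # LV is still 1 (latched); V is back to 0
```

In this model a latched flag records that the condition occurred at some earlier point, while the non-latched flag reflects only the most recent operation.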


OVM Overflow Mode Flag. In general, integer operations are affected by the 
OVM bit value. 


1 
The digit specifies the number of cycles required to execute the instruction. 


INST @98AEh,R5

                      Before Instruction               After Instruction
DP                    80h                              80h
R5                    076690 0000h  2.30562500e+02     1.80126593e+00
Memory at 80 98AEh    1.00001107e+00                   1.00001107e+00
LUF, LV, UF, N, Z, V, C   (flag values not recoverable from the source)


The sample code presented in the above format shows the effect of the code
on system pointers (e.g., DP or SP), registers (e.g., R1 or R5), memory at
specific locations, and the seven status bits. The values given for the registers
include the leading zeros to show the exponent in floating-point operations.
Decimal conversions are provided for all register and memory locations. The
seven status bits are listed in the order in which they appear in the assembler
and simulator (see Section 14.2, Condition Codes and Flags, and Table 14-8
on page 14-14 for further information on these seven status bits).


Assembly Language Instructions 14-25 


ABSF Absolute Value of Floating-Point Number 


Syntax ABSF src, dst 


Operands      src: general-addressing modes
              dst: register (R0-R11)

Opcode        [bit-field diagram; field contents not recoverable from the source]

Word Fields   G    src addressing modes
              00   register (R0-R11)
              01   direct
              10   indirect
              11   immediate

Operation     |src| → dst


Description The absolute value of the src operand is loaded into the dst register. The src 
and dst operands are assumed to be floating-point numbers. 


An overflow occurs if src (man) = 8000 0000h and src (exp) = 7Fh. The result 
is dst (man) = 7FFF FFFFh and dst (exp) = 7Fh. 
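
The overflow case can be checked numerically. The sketch below assumes the extended-precision register layout, which is not defined in this section: an 8-bit two's-complement exponent followed by a 32-bit mantissa word made of a sign bit and a 31-bit fraction with an implied leading bit, with exponent 80h reserved for zero. The decode helper is a hypothetical name used only for illustration.

```python
def decode(exp: int, man: int) -> float:
    """Decode an (8-bit exponent, 32-bit mantissa) pair to a Python float.

    Assumed format: value = (1 + f) * 2**e when the sign bit is 0 and
    (-2 + f) * 2**e when it is 1, where f is the 31-bit fraction.
    """
    e = exp - 256 if exp & 0x80 else exp     # two's-complement exponent
    if e == -128:                            # assumed encoding of zero
        return 0.0
    sign = (man >> 31) & 1
    f = (man & 0x7FFFFFFF) / 2**31           # fraction in [0, 1)
    return ((-2.0 if sign else 1.0) + f) * 2.0**e


most_negative = decode(0x7F, 0x80000000)     # the operand that overflows
largest_positive = decode(0x7F, 0x7FFFFFFF)  # the saturated ABSF result
print(abs(most_negative) > largest_positive)
print(decode(0x07, 0x66900000))              # 076690 0000h from the sample format
```

Under these assumptions the operand equals -2^128, whose magnitude is one step beyond the largest positive value, which is why the result saturates. The last line reproduces the 2.30562500e+02 shown for 076690 0000h in the sample example format.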


Status Bits   LUF  Unaffected
              LV   1 if a floating-point overflow occurs, unchanged otherwise
              UF   0
              N    0
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if a floating-point overflow occurs, 0 otherwise
              C    Unaffected


Mode Bit OVM operation is affected by the OVM bit’s value. 
Cycles 1 


Example       ABSF R4,R7

              Before Instruction      After Instruction
R4            -9.90337307e+27         -9.90337307e+27
R7            5.48527255e+37          9.90337307e+27
Status flags  (values not recoverable from the source)


ABSF||STF     Parallel ABSF and STF

Syntax        ABSF src2, dst1
           || STF  src3, dst2

Operands      src2: indirect (disp = 0, 1, IR0, IR1)
              dst1: register (R0-R7)
              src3: register (R0-R7)
              dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode        [bit-field diagram; field contents not recoverable from the source]

Word Fields   None.

Operation     |src2| → dst1
           || src3 → dst2

Description   A floating-point absolute value and a floating-point store are
              performed in parallel. All registers are read at the beginning and
              loaded at the end of the execute cycle. This means that if one of
              the parallel operations (STF) reads from a register and the
              operation being performed in parallel (ABSF) writes to the same
              register, then STF accepts as input the contents of the register
              before it is modified by the ABSF.

              If src2 and dst2 point to the same location, src2 is read before
              the write to dst2. If src3 and dst1 point to the same register,
              src3 is read before the write to dst1.

              An overflow occurs if src (man) = 8000 0000h and src (exp) = 7Fh.
              The result is dst (man) = 7FFF FFFFh and dst (exp) = 7Fh.

Status Bits   LUF  Unaffected
              LV   1 if a floating-point overflow occurs, unchanged otherwise
              UF   0
              N    0
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if a floating-point overflow occurs, 0 otherwise
              C    Unaffected

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1
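
The read-before-write rule for the parallel pair can be sketched as a two-phase update. The function and container names below are hypothetical; floats stand in for register and memory contents, and the values are taken from the ABSF||STF example (R4 = 1.79750e+02, data at 80 98AFh = -6.118750e+01).

```python
def parallel_absf_stf(regs, mem, src2_addr, dst1, src3, dst2_addr):
    """Model |src2| -> dst1 in parallel with src3 -> dst2.

    regs maps register names to floats and mem maps addresses to floats;
    both are illustrative stand-ins for machine state.
    """
    # Read phase: every operand is fetched before anything is written.
    loaded = mem[src2_addr]
    stored = regs[src3]
    # Write phase: results are committed at the end of the cycle.
    regs[dst1] = abs(loaded)
    mem[dst2_addr] = stored


regs = {"R4": 179.75}
mem = {0x8098AF: -61.1875, 0x8098C4: 0.0}
# R4 is both the ABSF destination and the STF source.
parallel_absf_stf(regs, mem, 0x8098AF, "R4", "R4", 0x8098C4)
print(regs["R4"], mem[0x8098C4])  # 61.1875 179.75
```

Because the store reads R4 before the absolute value is written back, memory receives the old R4 contents, not the new ones.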


Example       ABSF *++AR3(IR1),R4
           || STF  R4,*-AR7(1)

                      Before Instruction             After Instruction
AR3                   (not recoverable)              80 98AFh
IR1                   (not recoverable)              0AFh
R4                    1.79750e+02                    0574C0 0000h  6.118750e+01
AR7                   80 98C5h                       80 98C5h
Data at 80 98AFh      058B40 0000h  -6.118750e+01    058B40 0000h  -6.118750e+01
Data at 80 98C4h      0h                             0733C0 0000h  1.79750e+02
Status flags          (values not recoverable from the source)


ABSI          Absolute Value of Integer

Syntax        ABSI src, dst

Operands      src: general-addressing modes
              dst: register (any register in CPU primary-register file)

Opcode        [bit-field diagram; field contents not recoverable from the source]

Word Fields   G    src addressing modes
              00   register (any register in CPU primary-register file)
              01   direct
              10   indirect
              11   immediate

Operation     |src| → dst

Description   The absolute value of the src operand is loaded into the dst
              register. The src and dst operands are assumed to be signed
              integers.

              An overflow occurs if src = 8000 0000h. If ST(OVM) = 1, the result
              is dst = 7FFF FFFFh. If ST(OVM) = 0, the result is dst = 8000 0000h.

Status Bits   If ST (SET COND) = 0 and the destination register is R0-R11, the
              condition flags are modified. If ST (SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   1 if an integer overflow occurs, unchanged otherwise
              UF   0
              N    0
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if an integer overflow occurs, 0 otherwise
              C    Unaffected

Mode Bit      OVM operation is affected by OVM bit value.

Cycles        1
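
The ABSI result and its overflow behavior can be modeled as below. This is a minimal sketch on 32-bit words, not the device's implementation; the absi function name and its (dst, V) return convention are assumptions made for illustration.

```python
def absi(src: int, ovm: int) -> tuple[int, int]:
    """Return (dst, V) for a 32-bit word src, following the text above."""
    src &= 0xFFFFFFFF
    if src == 0x80000000:                  # -2**31 has no positive counterpart
        return (0x7FFFFFFF if ovm else 0x80000000), 1
    if src & 0x80000000:                   # negative: two's-complement negate
        return (-src) & 0xFFFFFFFF, 0
    return src, 0                          # already non-negative


print(hex(absi(0xFFFFFFCB, ovm=0)[0]))     # 0x35, i.e. 53
print(hex(absi(0x80000000, ovm=1)[0]))     # 0x7fffffff (saturated)
```

The first call uses the operand from ABSI Example 1 (0FFFF FFCBh, or -53).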


Example 1     ABSI R0,R0
              or
              ABSI R0

              Before Instruction      After Instruction
R0            0FFFF FFCBh  -53        035h  53

Example 2     ABSI *AR1,R3

              Before Instruction      After Instruction
AR1           (not recoverable)       (not recoverable)
R3            (not recoverable)       53
Data at 20h   0FFFF FFCBh  -53        0FFFF FFCBh  -53


ABSI||STI     Parallel ABSI and STI

Syntax        ABSI src2, dst1
           || STI  src3, dst2

Operands      src2: indirect (disp = 0, 1, IR0, IR1)
              dst1: register (R0-R7)
              src3: register (R0-R7)
              dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode        [bit-field diagram; field contents not recoverable from the source]

Word Fields   None

Operation     |src2| → dst1
           || src3 → dst2

Description   An integer absolute value and an integer store are performed in
              parallel. All registers are read at the beginning and loaded at
              the end of the execute cycle. This means that if one of the
              parallel operations (STI) reads from a register and the operation
              being performed in parallel (ABSI) writes to the same register,
              then STI accepts as input the contents of the register before it
              is modified by the ABSI.

              If src2 and dst2 point to the same location, src2 is read before
              the write to dst2.

              An overflow occurs if src = 8000 0000h. If ST(OVM) = 1, the result
              is dst = 7FFF FFFFh. If ST(OVM) = 0, the result is dst = 8000 0000h.

Status Bits   LUF  Unaffected
              LV   1 if an integer overflow occurs, unchanged otherwise
              UF   0
              N    0
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if an integer overflow occurs, 0 otherwise
              C    Unaffected

Mode Bit      OVM operation is affected by OVM bit value.

Cycles        1


Example       ABSI *-AR5(1),R5
           || STI  R1,*AR2--(IR1)

                      Before Instruction      After Instruction
AR5                   80 99E2h                80 99E2h
R5                    (not recoverable)       53
R1                    42h  66                 42h  66
AR2                   80 98FFh                80 98F0h
IR1                   0Fh                     0Fh
Data at 80 99E1h      (not recoverable)       (not recoverable)
Data at 80 98FFh      2h  2                   42h  66
Status flags          (values not recoverable from the source)


ADDC          Add Integer With Carry

Syntax        ADDC src, dst

Operands      src: general-addressing modes
              dst: register (any register in CPU primary-register file)

Opcode        [bit-field diagram; field contents not recoverable from the source]

Word Fields   G    src addressing modes
              00   register (any register in CPU primary-register file)
              01   direct
              10   indirect
              11   immediate

Operation     dst + src + C → dst

Description   The sum of the dst and src operands and the C (carry) flag is
              loaded into the dst register. The dst and src operands are assumed
              to be signed integers.

Status Bits   If ST (SET COND) = 0 and the destination register is R0-R11, the
              condition flags are modified. If ST (SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   1 if an integer overflow occurs, unchanged otherwise
              UF   0
              N    1 if a negative result is generated, 0 otherwise
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if an integer overflow occurs, 0 otherwise
              C    1 if a carry occurs, 0 otherwise

Mode Bit      OVM operation is affected by OVM bit value.

Cycles        1
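
The operation dst + src + C → dst can be modeled on 32-bit words as shown below. This is a sketch with hypothetical helper names (addc, to_signed); it ignores overflow detection and the OVM saturation behavior. The second half illustrates a common use of add-with-carry, chaining two 32-bit additions into a 64-bit sum, which is an illustration and not something this section prescribes.

```python
def addc(dst: int, src: int, carry_in: int) -> tuple[int, int]:
    """Return (result word, carry out) for dst + src + C on 32-bit words."""
    total = (dst & 0xFFFFFFFF) + (src & 0xFFFFFFFF) + carry_in
    return total & 0xFFFFFFFF, total >> 32


def to_signed(word: int) -> int:
    """Interpret a 32-bit word as a two's-complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word


# Register values from the ADDC example; they are consistent with C = 1 on entry.
result, _ = addc(-65122, -41947, 1)
print(to_signed(result))                 # -107068

# 64-bit addition from two 32-bit steps: low words first, then high words plus carry.
a, b = 0x00000001FFFFFFFF, 0x0000000000000001
lo, c = addc(a & 0xFFFFFFFF, b & 0xFFFFFFFF, 0)
hi, _ = addc(a >> 32, b >> 32, c)
print(hex((hi << 32) | lo))              # 0x200000000
```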


Example       ADDC R1,R5

              Before Instruction      After Instruction
R1            -41 947                 -41 947
R5            -65 122                 -107 068
Status flags  (values not recoverable from the source)


ADDC3         Add Integer With Carry, 3 Operands

Syntax        ADDC3 src2, src1, dst

Operands      src1, src2: type 1 or type 2 three-operand addressing modes
              dst: register mode (any register in CPU primary-register file)

Opcode        Type 1 (bit 31 to bit 0): 001000000  T  dst  src1  src2
              Type 2: [bit-field diagram; field contents not recoverable from the source]

Word Fields   Type 1
              T    src1 addressing modes                    src2 addressing modes
              00   register mode (any CPU register)         register mode (any CPU register)
              01   indirect mode (disp = 0, 1, IR0, IR1)    register mode (any CPU register)
              10   register mode (any CPU register)         indirect mode (disp = 0, 1, IR0, IR1)
              11   indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

              Type 2
              T    src1 addressing modes                    src2 addressing modes
              00   register mode (any CPU register)         8-bit signed immediate
              01   register mode (any CPU register)         indirect mode *+ARn(5-bit unsigned displacement)
              10   indirect mode *+ARn(5-bit unsigned       8-bit signed immediate
                   displacement)
              11   indirect mode *+ARn1(5-bit unsigned      indirect mode *+ARn2(5-bit unsigned
                   displacement)                            displacement)

Operation     src1 + src2 + C → dst

Description   The sum of the src1 and src2 operands and the value of the C
              (carry) flag is loaded into the dst register. The src1, src2, and
              dst operands are assumed to be signed integers.

Status Bits   If ST (SET COND) = 0, the condition flags are modified if the
              destination register is R0-R11. If ST (SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   1 if an integer overflow occurs, unchanged otherwise
              UF   0
              N    1 if a negative result is generated, 0 otherwise
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if an integer overflow occurs, 0 otherwise
              C    1 if a carry occurs, 0 otherwise

Mode Bit      OVM operation is affected by OVM bit value.

Cycles        1

Example       None


Add Floating-Point Values ADDF 


Syntax        ADDF src, dst

Operands      src: general-addressing modes
              dst: register (R0-R11)

Opcode        [bit-field diagram; field contents not recoverable from the source]

Word Fields   G    src addressing modes
              00   register (R0-R11)
              01   direct
              10   indirect
              11   immediate

Operation     dst + src → dst

Description   The sum of the dst and src operands is loaded into the dst
              register. The dst and src operands are assumed to be
              floating-point numbers.

Status Bits   LUF  1 if a floating-point underflow occurs, unchanged otherwise
              LV   1 if a floating-point overflow occurs, unchanged otherwise
              UF   1 if a floating-point underflow occurs, 0 otherwise
              N    1 if a negative result is generated, 0 otherwise
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if a floating-point overflow occurs, 0 otherwise
              C    Unaffected

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1


Example       ADDF *AR4++(IR1),R5

                      Before Instruction      After Instruction
AR4                   80 9800h                80 992Bh
IR1                   (not recoverable)       (not recoverable)
R5                    6.23750e+01             5.3268750e+02
Data at 80 9800h      4.7031250e+02           4.7031250e+02
Status flags          (values not recoverable from the source)


Add Floating-Point Values, 3 Operands ADDF3 


Syntax        ADDF3 src2, src1, dst

Operands      src1, src2: type 1 or type 2 three-operand addressing modes
              dst: register mode (R0-R11)

Opcode        Type 1: [bit-field diagram; field contents not recoverable from the source]
              Type 2: [bit-field diagram; field contents not recoverable from the source]

Word Fields   Type 1
              T    src1 addressing modes                    src2 addressing modes
              00   register mode (R0-R11)                   register mode (R0-R11)
              01   indirect mode (disp = 0, 1, IR0, IR1)    register mode (R0-R11)
              10   register mode (R0-R11)                   indirect mode (disp = 0, 1, IR0, IR1)
              11   indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

              Type 2
              T    src1 addressing modes                    src2 addressing modes
              01   register mode (any CPU register)         indirect mode *+ARn(5-bit unsigned
                                                            displacement)
              11   indirect mode *+ARn1(5-bit unsigned      indirect mode *+ARn2(5-bit unsigned
                   displacement)                            displacement)

Operation     src1 + src2 → dst


Description   The sum of the src1 and src2 operands is loaded into the dst
              register. The src1, src2, and dst operands are assumed to be
              floating-point numbers.

Status Bits   LUF  1 if a floating-point underflow occurs, unchanged otherwise
              LV   1 if a floating-point overflow occurs, unchanged otherwise
              UF   1 if a floating-point underflow occurs, 0 otherwise
              N    1 if a negative result is generated, 0 otherwise
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if a floating-point overflow occurs, 0 otherwise
              C    Unaffected

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1

Example       ADDF3 *+AR1(2),*+AR1(8),R4

AR1                   (not recoverable, before and after)
R4                    After Instruction: 070DB2 0000h  1.41695313e+02
Data at 22F F822h     1.28940e+02
Data at 22F F828h     1.27590e+01
Status flags          (values not recoverable from the source)


ADDF3||STF    Parallel ADDF3 and STF

Syntax        ADDF3 src2, src1, dst1
           || STF   src3, dst2

Operands      src1: register (R0-R7)
              src2: indirect (disp = 0, 1, IR0, IR1)
              dst1: register (R0-R7)
              src3: register (R0-R7)
              dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode        [bit-field diagram, only partly legible in the source; the
              legible fields are 00110, dst1, src1, dst2, and src2]

Word Fields   None

Operation     src1 + src2 → dst1
           || src3 → dst2

Description   A floating-point addition and a floating-point store are performed
              in parallel. All registers are read at the beginning and loaded at
              the end of the execute cycle. This means that if one of the
              parallel operations (STF) reads from a register and the operation
              being performed in parallel (ADDF3) writes to the same register,
              then STF accepts as input the contents of the register before it
              is modified by the ADDF3.

              If src2 and dst2 point to the same location, src2 is read before
              the write to dst2.

Status Bits   LUF  1 if a floating-point underflow occurs, unchanged otherwise
              LV   1 if a floating-point overflow occurs, unchanged otherwise
              UF   1 if a floating-point underflow occurs, 0 otherwise
              N    1 if a negative result is generated, 0 otherwise
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if a floating-point overflow occurs, 0 otherwise
              C    Unaffected

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1


Example       ADDF3 *+AR3(IR1),R2,R5
           || STF   R4,*AR2

                      Before Instruction            After Instruction
AR3                   80 9800h                      80 9800h
IR1                   (not recoverable)             (not recoverable)
R2                    1.4050e+02                    1.4050e+02
R5                    (not recoverable)             3.20250e+02
R4                    6.281250e+01                  6.281250e+01
AR2                   80 98F3h                      (not recoverable)
Data at 80 98A5h      0733C0 0000h  1.79750e+02     1.79750e+02
Data at 80 98F3h      0h                            6.28125e+01
Status flags          (values not recoverable from the source)


ADDI          Add Integer

Syntax        ADDI src, dst

Operands      src: general-addressing modes
              dst: register (any register in CPU primary-register file)

Opcode        [bit-field diagram; field contents not recoverable from the source]

Word Fields   G    src addressing modes
              00   register (any register in CPU primary-register file)
              01   direct
              10   indirect
              11   immediate

Operation     dst + src → dst

Description   The sum of the dst and src operands is loaded into the dst
              register. The dst and src operands are assumed to be signed
              integers.

Status Bits   If ST (SET COND) = 0 and the destination register is R0-R11, the
              condition flags are modified. If ST (SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   1 if an integer overflow occurs, unchanged otherwise
              UF   0
              N    1 if a negative result is generated, 0 otherwise
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if an integer overflow occurs, 0 otherwise
              C    1 if a carry occurs, 0 otherwise

Mode Bit      OVM operation is affected by OVM bit value.


None 


Example       ADDI R3,R7

              Before Instruction      After Instruction
R3            -53                     (not recoverable)
R7            53                      (not recoverable)
Status flags  (values not recoverable from the source)


ADDI3         Add Integer, 3 Operands

Syntax        ADDI3 src2, src1, dst

Operands      src1, src2: type 1 or type 2 three-operand addressing modes
              dst: register mode (any register in CPU primary-register file)

Opcode        Type 1 (bit 31 to bit 0): 001000010  T  dst  src1  src2
              Type 2 (bit 31 to bit 0): 001100010  T  dst  (remaining fields not
              recoverable from the source)

Word Fields   Type 1
              T    src1 addressing modes                    src2 addressing modes
              00   register mode (any CPU register)         register mode (any CPU register)
              01   indirect mode (disp = 0, 1, IR0, IR1)    register mode (any CPU register)
              10   register mode (any CPU register)         indirect mode (disp = 0, 1, IR0, IR1)
              11   indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

              Type 2
              T    src1 addressing modes                    src2 addressing modes
              00   register mode (any CPU register)         8-bit signed immediate
              01   register mode (any CPU register)         indirect mode *+ARn(5-bit unsigned
                                                            displacement)
              10   indirect mode *+ARn(5-bit unsigned       8-bit signed immediate
                   displacement)
              11   indirect mode *+ARn1(5-bit unsigned      indirect mode *+ARn2(5-bit unsigned
                   displacement)                            displacement)

Operation     src1 + src2 → dst

Description   The sum of the src1 and src2 operands is loaded into the dst
              register. The src1, src2, and dst operands are assumed to be
              signed integers.

Status Bits   If ST (SET COND) = 0 and the destination register is R0-R11, the
              condition flags are modified. If ST (SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   1 if an integer overflow occurs, unchanged otherwise
              UF   0
              N    1 if a negative result is generated, 0 otherwise
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if an integer overflow occurs, 0 otherwise
              C    1 if a carry occurs, 0 otherwise

Mode Bit      OVM operation is affected by OVM bit value.

Cycles        1

Example       None


ADDI3||STI    Parallel ADDI3 and STI

Syntax        ADDI3 src2, src1, dst1
           || STI   src3, dst2

Operands      src1: register (R0-R7)
              src2: indirect (disp = 0, 1, IR0, IR1)
              dst1: register (R0-R7)
              src3: register (R0-R7)
              dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode        [bit-field diagram, only partly legible in the source; the
              legible fields are src1, src3, dst2, and src2]

Word Fields   None

Operation     src1 + src2 → dst1
           || src3 → dst2

Description   An integer addition and an integer store are performed in
              parallel. All registers are read at the beginning and loaded at
              the end of the execute cycle. This means that if one of the
              parallel operations (STI) reads from a register and the operation
              being performed in parallel (ADDI3) writes to the same register,
              then STI accepts as input the contents of the register before it
              is modified by the ADDI3.

              If src2 and dst2 point to the same location, src2 is read before
              the write to dst2.

Status Bits   LUF  Unaffected
              LV   1 if an integer overflow occurs, unchanged otherwise
              UF   0
              N    1 if a negative result is generated, 0 otherwise
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if an integer overflow occurs, 0 otherwise
              C    1 if a carry occurs, 0 otherwise

Mode Bit      OVM operation is affected by OVM bit value.

Cycles        1


Example       ADDI3 *AR0--(IR0),R5,R0
           || STI   R3,*AR7

                      Before Instruction      After Instruction
AR0                   80 992Ch                80 9920h
IR0                   (not recoverable)       (not recoverable)
R5                    220                     220
R0                    (not recoverable)       520
R3                    53                      53
AR7                   80 983Bh                80 983Bh
Data at 80 992Ch      300                     300
Data at 80 983Bh      0h                      35h  53
Status flags          (values not recoverable from the source)


AND           Bitwise Logical-AND

Syntax        AND src, dst

Operands      src: general-addressing modes
              dst: register (any register in CPU primary-register file)

Opcode        [bit-field diagram; field contents not recoverable from the source]

Word Fields   G    src addressing modes
              00   register (any register in CPU primary-register file)
              01   direct
              10   indirect
              11   immediate

Operation     dst AND src → dst

Description   The bitwise-logical AND between the dst and src operands is loaded
              into the dst register. The dst and src operands are assumed to be
              unsigned integers.

Status Bits   If ST (SET COND) = 0 and the destination register is R0-R11, the
              condition flags are modified. If ST (SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   Unaffected
              UF   0
              N    MSB of the output
              Z    1 if a zero result is generated, 0 otherwise
              V    0
              C    Unaffected

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1


Example       AND R1,R2

              Before Instruction      After Instruction
R1            (not recoverable)       80h
R2            (not recoverable)       (not recoverable)
Status flags  (values not recoverable from the source)


AND3          Bitwise Logical-AND, 3 Operands

Syntax        AND3 src2, src1, dst

Operands      src1, src2: type 1 or type 2 three-operand addressing modes
              dst: register mode (any register in CPU primary-register file)

Opcode        Type 1 (bit 31 to bit 0): 001000011  T  dst  src1  src2
              Type 2 (bit 31 to bit 0): 001100011  T  dst  (remaining fields not
              recoverable from the source)

Word Fields   Type 1
              T    src1 addressing modes                    src2 addressing modes
              00   register mode (any CPU register)         register mode (any CPU register)
              01   indirect mode (disp = 0, 1, IR0, IR1)    register mode (any CPU register)
              10   register mode (any CPU register)         indirect mode (disp = 0, 1, IR0, IR1)
              11   indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

              Type 2
              T    src1 addressing modes                    src2 addressing modes
              00   register mode (any CPU register)         8-bit signed immediate
              01   register mode (any CPU register)         indirect mode *+ARn(5-bit unsigned
                                                            displacement)
              10   indirect mode *+ARn(5-bit unsigned       8-bit signed immediate
                   displacement)
              11   indirect mode *+ARn1(5-bit unsigned      indirect mode *+ARn2(5-bit unsigned
                   displacement)                            displacement)

Operation     src1 AND src2 → dst

Description   The bitwise logical-AND between the src1 and src2 operands is
              loaded into the dst register. The src1, src2, and dst operands are
              assumed to be unsigned integers. The immediate src2 addressing
              mode is sign-extended.

Status Bits   If ST (SET COND) = 0 and the destination register is R0-R11, the
              condition flags are modified. If ST (SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   Unaffected
              UF   0
              N    MSB of the output
              Z    1 if a zero result is generated, 0 otherwise
              V    0
              C    Unaffected

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1

Example       Notice the difference between AND and AND3 in this example:

              Instruction          Before               After
              AND3 80h, R0, R0     R0 = FFFF FFFFh      R0 = FFFF FF80h
              AND  80h, R0         R0 = FFFF FFFFh      R0 = 0000 0080h
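
The two results differ only in how the immediate 80h is widened before the AND. A minimal sketch, assuming (as the example values indicate) that the AND3 8-bit immediate is sign-extended while the AND immediate is not; sign_extend8 is a hypothetical helper name.

```python
def sign_extend8(imm: int) -> int:
    """Sign-extend an 8-bit immediate to a 32-bit word."""
    imm &= 0xFF
    return (imm | 0xFFFFFF00) if imm & 0x80 else imm


r0 = 0xFFFFFFFF
and3_result = r0 & sign_extend8(0x80)   # AND3 80h, R0, R0
and_result = r0 & 0x80                  # AND  80h, R0 (immediate not sign-extended)
print(f"{and3_result:08X} {and_result:08X}")  # FFFFFF80 00000080
```

To mask a single bit with AND3, keep in mind that any 8-bit immediate with bit 7 set also preserves bits 31-8 of the other operand.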


AND3||STI     Parallel AND3 and STI

Syntax        AND3 src2, src1, dst1
           || STI  src3, dst2

Operands      src1: register (R0-R7)
              src2: indirect (disp = 0, 1, IR0, IR1)
              dst1: register (R0-R7)
              src3: register (R0-R7)
              dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode        [bit-field diagram, only partly legible in the source; the
              legible fields are 01000, src1, src3, dst2, and src2]

Word Fields   None

Operation     src1 AND src2 → dst1
           || src3 → dst2

Description   A bitwise-logical AND and an integer store are performed in
              parallel. All registers are read at the beginning and loaded at
              the end of the execute cycle. This means that if one of the
              parallel operations (STI) reads from a register and the operation
              being performed in parallel (AND3) writes to the same register,
              then STI accepts as input the contents of the register before it
              is modified by the AND3.

              If src2 and dst2 point to the same location, src2 is read before
              the write to dst2.

Status Bits   LUF  Unaffected
              LV   Unaffected
              UF   0
              N    MSB of the output
              Z    1 if a zero result is generated, 0 otherwise
              V    0
              C    Unaffected

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1


Example       AND3 *+AR1(IR0),R4,R7
           || STI  R3,*AR2

                      Before Instruction      After Instruction
AR1                   (not recoverable)       (not recoverable)
IR0                   (not recoverable)       (not recoverable)
R4                    (not recoverable)       (not recoverable)
R7                    (not recoverable)       (not recoverable)
R3                    53                      53
AR2                   80 983Fh                80 983Fh
Data at 80 99F9h      5C53h                   5C53h
Data at 80 983Fh      0h                      35h  53
Status flags          (values not recoverable from the source)


ANDN          Bitwise Logical-AND With Complement

Syntax        ANDN src, dst

Operands      src: general-addressing modes
              dst: register (any register in CPU primary-register file)

Opcode        [bit-field diagram; field contents not recoverable from the source]

Word Fields   G    src addressing modes
              00   register (any register in CPU primary-register file)
              01   direct
              10   indirect
              11   immediate

Operation     dst AND ~src → dst

Description   The bitwise-logical AND between the dst operand and the
              bitwise-logical complement (~) of the src operand is loaded into
              the dst register. The dst and src operands are assumed to be
              unsigned integers.

Status Bits   If ST (SET COND) = 0 and the destination register is R0-R11, the
              condition flags are modified. If ST (SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   Unaffected
              UF   0
              N    MSB of the output
              Z    1 if a zero result is generated, 0 otherwise
              V    0
              C    Unaffected

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1


Example       ANDN @980Ch,R2

              The example lists DP, R2, the data at 80 980Ch, and the status
              flags before and after the instruction. [Values not recoverable
              from the source.]


ANDN3         Bitwise Logical-ANDN, 3 Operands

Syntax        ANDN3 src2, src1, dst

Operands      src1, src2: type 1 or type 2 three-operand addressing modes
              dst: register mode (any register in CPU primary-register file)

Opcode        Type 1 (bit 31 to bit 0): 001000100  T  dst  src1  src2
              Type 2: [bit-field diagram; field contents not recoverable from the source]

Word Fields   Type 1
              T    src1 addressing modes                    src2 addressing modes
              00   register mode (any CPU register)         register mode (any CPU register)
              01   indirect mode (disp = 0, 1, IR0, IR1)    register mode (any CPU register)
              10   register mode (any CPU register)         indirect mode (disp = 0, 1, IR0, IR1)
              11   indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

              Type 2
              T    src1 addressing modes                    src2 addressing modes
              00   register mode (any CPU register)         8-bit signed immediate
              01   register mode (any CPU register)         indirect mode *+ARn(5-bit unsigned
                                                            displacement)
              10   indirect mode *+ARn(5-bit unsigned       8-bit signed immediate
                   displacement)
              11   indirect mode *+ARn1(5-bit unsigned      indirect mode *+ARn2(5-bit unsigned
                   displacement)                            displacement)

Operation     src1 AND ~src2 → dst

Description   The bitwise-logical AND between the src1 operand and the
              bitwise-logical complement (~) of the src2 operand is loaded into
              the dst register. The src1, src2, and dst operands are assumed to
              be unsigned integers. The immediate src2 addressing mode is
              sign-extended.

Status Bits   LUF  Unaffected
              LV   Unaffected
              UF   0
              N    MSB of the output
              Z    1 if a zero result is generated, 0 otherwise
              V    0
              C    Unaffected

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1

Example       None


ASH           Arithmetic Shift

Syntax        ASH src_count, dst

Operands      src_count: general-addressing modes
              dst: register (any register in CPU primary-register file)

Opcode        [bit-field diagram, only partly legible in the source; the
              legible fields are G, dst, and src_count]

Word Fields   G    src addressing modes
              00   register (any register in CPU primary-register file)
              01   direct
              10   indirect
              11   immediate

Operation     count = 7 LSBs of src_count
              If (count ≥ 0):
                 dst << count → dst
              Else:
                 dst >> |count| → dst

Description   The seven least-significant bits of the src_count operand
              constitute the 2s-complement shift count of up to 32 bits.

              If count is greater than 0, the dst operand is left-shifted by the
              value of count. Low-order bits shifted in are zero-filled, and
              high-order bits are shifted out through the C (carry) bit.

              Arithmetic left-shift:
                 C ← dst ← 0

              If count is less than 0, the dst operand is right-shifted by the
              absolute value of count. The high-order bits of the dst operand
              are sign-extended as it is right-shifted. Low-order bits are
              shifted out through the C (carry) bit.

              Arithmetic right-shift:
                 sign of dst → dst → C


If count is 0, no shift is performed, and the C (carry) bit is set to 0. The 
src_count and dst operands are assumed to be signed integers. 
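
The shift rules above can be modeled on 32-bit words as follows. This is a sketch under the stated rules, not the device's implementation: the ash function name is hypothetical, and overflow detection (the V and LV flags) is not modeled. The test value uses the ASH Example 2 operands as best read from the source (R5 = 0AEC0 0001h, shift count word 0FFE8h); only the upper byte of R5 affects the result.

```python
def ash(dst: int, src_count: int) -> tuple[int, int]:
    """Return (result word, C) for an arithmetic shift of a 32-bit word."""
    count = src_count & 0x7F               # 7 LSBs of src_count
    if count & 0x40:                       # interpret as 7-bit two's complement
        count -= 0x80
    dst &= 0xFFFFFFFF
    if count == 0:
        return dst, 0                      # no shift; C is set to 0
    if count > 0:                          # left shift, zero fill
        wide = dst << count
        return wide & 0xFFFFFFFF, (wide >> 32) & 1
    n = -count                             # right shift, sign fill
    signed = dst - (1 << 32) if dst & 0x80000000 else dst
    carry = (signed >> (n - 1)) & 1        # last bit shifted out
    return (signed >> n) & 0xFFFFFFFF, carry


result, c = ash(0xAEC00001, 0xFFE8)        # 7 LSBs = 68h, so count = -24
print(f"{result:08X} {c}")                 # FFFFFFAE 1
```

The result matches the 0FFFF FFAEh shown for R5 after the instruction in Example 2.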


Status Bits   If ST (SET COND) = 0 and the destination register is R0-R11, the
              condition flags are modified. If ST (SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   1 if an integer overflow occurs, unchanged otherwise
              UF   0
              N    MSB of the output
              Z    1 if a zero result is generated, 0 otherwise
              V    1 if an integer overflow occurs, 0 otherwise
              C    Set to the value of the last bit shifted out. 0 for a shift count of 0

Mode Bit      OVM operation is not affected by OVM bit value.

Cycles        1

Example 1 ASH R1,R3 

Before Instruction                After Instruction 

R1 10h (16)                       R1 10h (16) 
R3 (value not legible)            R3 (value not legible) 
Status flags LUF, LV, UF, N, Z, V, C: values not legible 

Example 2 ASH @98C3h,R5 

Before Instruction                After Instruction 

DP 80h                            DP 80h 
R5 0AEC0 0001h                    R5 0FFFF FFAEh 
Data at 80 98C3h 0FFE8h           Data at 80 98C3h 0FFE8h 
Status flags LUF, LV, UF, N, Z, V, C: values not legible 

ASH3 Arithmetic Shift, 3 Operands 


Syntax ASH3 src_count, src, dst 


Operands src, src_count: type 1 or type 2 three-operand addressing modes 
dst: register mode (any register in CPU primary-register file) 


Opcode 
Type 1 
31 24 23 16 15 8 7 0 
001 000101 | T | dst | src | src_count 

Type 2 
31 24 23 16 15 8 7 0 


Word Fields 
Type 1 
T   src1 addressing modes                        src2 addressing modes 
00  register mode (any CPU register)             register mode (any CPU register) 
01  indirect mode (disp = 0, 1, IR0, IR1)        register mode (any CPU register) 
10  register mode (any CPU register)             indirect mode (disp = 0, 1, IR0, IR1) 
11  indirect mode (disp = 0, 1, IR0, IR1)        indirect mode (disp = 0, 1, IR0, IR1) 

Type 2 
T   src1 addressing modes                                   src2 addressing modes 
00  register mode (any CPU register)                        8-bit signed immediate 
01  register mode (any CPU register)                        indirect mode *+ARn(5-bit unsigned displacement) 
10  indirect mode *+ARn(5-bit unsigned displacement)        8-bit signed immediate 
11  indirect mode *+ARn1(5-bit unsigned displacement)       indirect mode *+ARn2(5-bit unsigned displacement) 


Operation count = 7 LSBs of src_count 
If (count >= 0): 
src << count → dst 
Else: 
src >> |count| → dst 


Description The seven least-significant bits of the src_count operand constitute the 2s- 
complement shift count. 


If count is greater than 0, the src operand is left-shifted by the value of count. 
Low-order bits shifted in are zero-filled, and high-order bits are shifted out 
through the status register's C (carry) bit. 


Arithmetic left-shift: C ← src ← 0 


If count is less than 0, the src operand is right-shifted by the absolute value of 
count (e.g., -4 = right-shift 4). The high-order bits of the src operand are sign- 
extended as it is right-shifted. Low-order bits are shifted out through the 
C (carry) bit. 


Arithmetic right-shift: sign of src → src → C 


If count is 0, no shift is performed, and the C (carry) bit is set to 0. The 
src_count, src, and dst operands are assumed to be signed integers. 


Status Bits LUF Unaffected 
LV 1 if an integer overflow occurs, unchanged otherwise 
UF 0 
N MSB of the output 
Z 1 if a zero result is generated, 0 otherwise 
V 1 if an integer overflow occurs, 0 otherwise 
C Set to the value of the last bit shifted out; 0 for a shift count of 0 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
Example None 


ASH3||STI Parallel ASH3 and STI 


Syntax ASH3 src_count, src2, dst1 
|| STI src3, dst2 


Operands src_count: register (R0 - R7) 
src2: indirect (disp = 0, 1, IR0, IR1) 
dst1: register (R0 - R7) 
src3: register (R0 - R7) 
dst2: indirect (disp = 0, 1, IR0, IR1) 


Opcode 
31 24 23 16 15 8 7 0 
11 | 01001 | dst1 | src_count | src3 | dst2 | src2 


Word Fields None 


Operation count = 7 LSBs of src_count 
If (count >= 0): 
src2 << count → dst1 
Else: 
src2 >> |count| → dst1 
|| src3 → dst2 


Description The seven least-significant bits of the src_count operand register constitute 
the 2s-complement shift count of up to 32 bits. 


If count is greater than 0, the src2 operand is left-shifted by the value of count. 
Low-order bits shifted in are zero-filled, and high-order bits are shifted out 
through the C (carry) bit. 


Arithmetic left-shift: C ← src2 ← 0 


If count is less than 0, the src2 operand is right-shifted by the absolute value of 
count. The high-order bits of the src2 operand are sign-extended as it is right- 
shifted. Low-order bits are shifted out through the C (carry) bit. 


Arithmetic right-shift: sign of src2 → src2 → C 


If count is 0, no shift is performed, and the C (carry) bit is set to 0. The src_count, 
src2, and dst1 operands are assumed to be signed integers. 


All registers are read at the beginning and loaded at the end of the execute 
cycle. This means that if one of the parallel operations (STI) reads from a regis- 
ter and the operation being performed in parallel (ASH3) writes to the same 
register, then STI accepts as input the contents of the register before it is modi- 
fied by the ASH3. If src2 and dst2 point to the same location, src2 is read before 
the write to dst2. 


Status Bits LUF Unaffected 
LV 1 if an integer overflow occurs, unchanged otherwise 
UF 0 
N MSB of the output 
Z 1 if a zero result is generated, 0 otherwise 
V 1 if an integer overflow occurs, 0 otherwise 
C Set to the value of the last bit shifted out; 0 for a shift count of 0 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
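
The read-before-write rule for the parallel ASH3 and STI pair can be illustrated with a small Python sketch. It is only a model under assumptions: `arith_shift` and `ash3_sti` are hypothetical names, registers live in a dictionary keyed by name, memory is a dictionary keyed by address, and status bits and address-register updates are ignored.

```python
MASK32 = 0xFFFFFFFF


def arith_shift(value, src_count):
    """32-bit arithmetic shift by the signed 7-bit count in src_count."""
    count = src_count & 0x7F
    if count & 0x40:
        count -= 0x80
    signed = value - (1 << 32) if value & 0x80000000 else value
    shifted = signed << count if count >= 0 else signed >> -count
    return shifted & MASK32


def ash3_sti(regs, mem, count_reg, src2_addr, dst1, src3, dst2_addr):
    """Model ASH3 src_count, src2, dst1 || STI src3, dst2."""
    # Read every source operand at the start of the execute cycle ...
    count_val = regs[count_reg]
    shift_in = mem[src2_addr]
    store_val = regs[src3]
    # ... and commit both results only at the end of it.
    regs[dst1] = arith_shift(shift_in, count_val)
    mem[dst2_addr] = store_val
```

With this ordering, an STI whose src3 is the same register as dst1 stores the old register contents, and a src2 that shares an address with dst2 is read before the store overwrites it.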


Example ASH3 R1,*AR6++(IR1),R0 
|| STI R5,*AR2 


Before Instruction                After Instruction 

AR6 80 9900h                      AR6 80 998Ch 
R1 -24                            R1 -24 
R5 53                             R5 53 
AR2 80 98A2h                      AR2 80 98A2h 
Data at 80 98A2h 0h               Data at 80 98A2h 53 
IR1, R0, Data at 80 9900h, and status flags LUF, LV, UF, N, Z, V, C: values not legible 


Bcond Branch Conditionally (Standard) 


Syntax Bcond src 
Operands src: conditional-branch-addressing modes (B) 
Opcode 
31 24 23 16 15 8 7 0 
011010 | B | 000 | 0 | cond | register or displacement 


Word Fields B src addressing modes 
0 register 
1 PC relative 


Operation If cond is true: 
If src is in register-addressing mode (any register in CPU primary- 
register file), 
src → PC. 
If src is in PC-relative mode (label or address), displacement + PC + 1 → PC. 
Else, continue. 


Description Bcond signifies a standard branch that executes in four cycles. A branch is per- 
formed if the condition is true (since a pipeline flush also occurs on a true condi- 
tion; see Section 8.2 on page 8-4). If the src operand is expressed in register- 
addressing mode, the contents of the specified register are loaded into the PC. 
If the src operand is expressed in PC-relative mode, the assembler generates 
a displacement: displacement = label - (PC of branch instruction + 1). This dis- 
placement is stored as a 16-bit signed integer in the 16 least-significant bits 
of the branch instruction word. This displacement is added to the PC of the 
branch instruction plus 1 to generate the new PC. 


The ’C4x provides 20 condition codes that can be used with this instruction 
(see Section 14.2 for a list of condition mnemonics, encoding, and flags). 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 4 (regardless of whether the branch is taken) 
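
The PC-relative displacement arithmetic described for Bcond can be checked with a short Python sketch. The helper names `encode_displacement` and `branch_target` are hypothetical, and the code only mirrors the formula displacement = label - (PC of branch instruction + 1); it is not the assembler's implementation.

```python
def encode_displacement(target, branch_pc, offset=1, bits=16):
    """Return the signed displacement packed into a `bits`-wide field."""
    disp = target - (branch_pc + offset)
    if not -(1 << (bits - 1)) <= disp < (1 << (bits - 1)):
        raise ValueError("target out of range for PC-relative branch")
    return disp & ((1 << bits) - 1)


def branch_target(field, branch_pc, offset=1, bits=16):
    """Sign-extend the field and add it to (PC of branch + offset)."""
    disp = field - (1 << bits) if field & (1 << (bits - 1)) else field
    return branch_pc + offset + disp
```

The same two helpers cover the other forms in this chapter by changing the parameters: `offset=3` for the delayed branches, and `bits=24` for the 24-bit displacement of BR, BRD, and CALL.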


Example BZ R0 

Before Instruction                After Instruction 

PC, R0, and status flags LUF, LV, UF, N, Z, V, C: values not legible 
Note: 


If a BZ instruction is executed immediately following a RND instruction with 
a zero operand, the branch is not performed, because the zero flag is not set. 
To circumvent this problem, execute a BZUF instead of a BZ instruction. 




BcondAF Branch Conditionally Delayed and Annul If False 


Syntax BcondAF src 
Operands src: conditional-branch-addressing modes (B) 
Opcode 
31 24 23 16 15 8 7 0 
011010 | B | 0101 | cond | register or displacement 


Word Fields B src addressing modes 
0 register 
1 PC relative 


Operation If (cond is true): 
If (src is a register): 
src → PC 
If (src is in PC-relative mode): 
displacement + PC of branch + 3 → PC 
Else (cond is false): 
annul the effect of the execute phase of the first following instruction and the 
effect of the read and execute phases of the second and third following 
instructions, and continue. 


Description If the condition is true, a branch and the three instructions following the branch 
instruction are executed. If the condition is false, no branch is performed, and 
the effect of the execute phase of the first following instruction and of the read 
and execute phases of the second and third following instructions is annulled. 
The three instructions following BcondAF do not affect the cond. If the src oper- 
and is in register mode, then the contents of the specified register are loaded 
into the PC. If the src operand is in PC-relative mode, then the sum of the PC 
of the branch instruction + 3 and the displacement is loaded into the PC. In PC- 
relative mode, the displacement field is interpreted as a 16-bit signed integer. 


None of the three instructions following the BcondAF can be an instruction that 
modifies the program flow. Interrupts are disabled for the duration of the 
BcondAF instruction. 


BcondAF is especially useful for controlling the exit at the bottom of a loop. Use 
caution when using instructions such as PUSH/POP, LDPK, or LDA that can 
modify registers like ARn, SP, and DP in the decode and/or read phase. This 
also applies when using instructions to perform indirect addressing with ARn 
modification. 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
Example None 




BcondAT Branch Conditionally Delayed and Annul If True 


Syntax BcondAT src 
Operands src: conditional-branch-addressing modes (B) 
Opcode 
31 24 23 16 15 8 7 0 
011010 | B | 0011 | cond | register or displacement 


Word Fields B src addressing modes 
0 register 
1 PC relative 


Operation If (cond is true): 
If (src is a register): 
src → PC 
annul the effect of the execute phase of the first following instruction and of 
the read and execute phases of the second and third following instructions. 
If (src is in PC-relative mode): 
displacement + PC of branch + 3 → PC 
annul the effect of the execute phase of the first following instruction and of 
the read and execute phases of the second and third following instructions. 
Else, continue. 


Description If the condition is true, a branch is performed, and the effect of the execute 
phase of the first following instruction and of the read and execute phases of 
the second and third following instructions is annulled. The three instructions fol- 
lowing BcondAT do not affect the cond. If the src operand is expressed in regis- 
ter mode, then the contents of the specified register are loaded into the PC. 
If the src operand is in PC-relative mode, then the sum of the PC of the branch 
instruction + 3 and the displacement is loaded into the PC. In PC-relative 
mode, the displacement field is interpreted as a 16-bit signed integer. 


None of the three instructions following BcondAT can be an instruction that 
modifies the program flow. Interrupts are disabled for the duration of BcondAT. 


The BcondAT instruction does not annul the status signals at the external inter- 
faces. Be especially careful when using instructions such as PUSH/POP, 
LDPK, or LDA that can modify registers like ARn, SP, and DP in the decode 
and/or read phase. This also applies when you use instructions to perform indi- 
rect addressing with ARn modification. BcondAT is particularly useful for con- 
trolling the entry at the top of the loop. 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
Example None 
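
The difference between the two annulling branches, BcondAF and BcondAT, can be summarized with a small Python sketch. It is a simplified model under assumptions: `run_annulling_branch` is a hypothetical name, each of the three following instructions is a callable acting on a state dictionary, annulment is treated as skipping the instruction entirely (decode-phase side effects such as ARn updates are not modeled), and a branch that is not taken is assumed to resume after the three following instructions.

```python
def run_annulling_branch(kind, cond, branch_pc, target, delay_slots, state):
    """Model BcondAF (kind='AF') or BcondAT (kind='AT'); return the next PC."""
    if kind not in ("AF", "AT"):
        raise ValueError("kind must be 'AF' or 'AT'")
    annul = (kind == "AF" and not cond) or (kind == "AT" and cond)
    if not annul:
        for slot in delay_slots:      # the three instructions after the branch
            slot(state)
    # Assumption: when cond is false, execution continues past the delay slots.
    return target if cond else branch_pc + 4
```

In this model BcondAF keeps the three following instructions only when the branch is taken, which suits a loop-closing branch, while BcondAT keeps them only when execution falls through.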




BcondD Branch Conditionally (Delayed) 


Syntax BcondD src 
Operands src: conditional-branch-addressing modes (B) 
Opcode 
31 24 23 16 15 8 7 0 
011010 | B | 000 | 1 | cond | register or displacement 


Word Fields B src addressing modes 
0 register 
1 PC relative 


Operation If cond is true: 
If src is in register-addressing mode (any register in CPU primary- 
register file), 
src → PC. 
If src is in PC-relative mode (label or address), displacement + PC + 3 → PC. 
Else, continue. 


Description BcondD signifies a delayed branch, allowing the three instructions after the 
delayed branch to be performed before the PC is modified. The effect is a 
single-cycle branch, and the three instructions following BcondD do not affect the 
cond. 


None of the three instructions following BcondD should be an instruction that 
modifies program flow. Interrupts are disabled for the duration of BcondD. 


A branch is performed if the condition is true. If the src operand is expressed 
in register-addressing mode, the contents of the specified register are loaded 
into the PC. If the src operand is expressed in PC-relative mode, the assembler 
generates a displacement: displacement = src - (PC of branch instruction + 
3). This displacement is stored as a 16-bit signed integer in the 16 least-signifi- 
cant bits of the branch instruction. This displacement is added to the PC of the 
branch instruction plus 3 to generate the new PC. The ’C4x provides 20 condi- 
tion codes that can be used with this instruction (see Section 14.2 for a list of 
condition mnemonics, encoding, and flags). 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 

Cycles 1 
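
The execution order implied by a delayed branch such as BcondD can be traced with a short Python sketch. This is a toy model under assumptions: `trace_delayed_branch` is a hypothetical name, every instruction occupies one word, and the pipeline is reduced to the single rule that three more instructions run before the PC is replaced by displacement + PC of branch + 3.

```python
def trace_delayed_branch(program, pc, steps):
    """Return the list of PCs executed over `steps` instructions.

    `program` maps a PC to ("BD", displacement, cond_is_true) for a delayed
    branch; any other address is treated as a plain one-word instruction.
    """
    trace = []
    pending = None  # (delay slots left, branch target) once a branch is taken
    for _ in range(steps):
        trace.append(pc)
        op = program.get(pc)
        next_pc = pc + 1
        if pending is not None:
            slots_left, target = pending
            slots_left -= 1
            if slots_left == 0:
                next_pc, pending = target, None
            else:
                pending = (slots_left, target)
        elif op is not None and op[0] == "BD" and op[2]:
            # displacement + PC of branch + 3 -> PC, after three more instructions
            pending = (3, pc + 3 + op[1])
        pc = next_pc
    return trace
```

A taken branch at 10h with displacement 2Dh therefore executes 10h, 11h, 12h, 13h and only then continues at 40h.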

Example BNZD 36 (36 = 24h) 

Before Instruction                After Instruction 

PC and status flags LUF, LV, UF, N, Z, V, C: values not legible 


BR Branch Unconditionally (Standard) 


Syntax BR src 
Operands src: in PC-relative mode 
Opcode 
31 24 23 16 15 8 7 0 


Word Fields None 
Operation PC + 1 + displacement → PC 


Description Performs an unconditional branch. The assembler generates a displacement: 
displacement = src - (PC of branch instruction + 1). This displacement is 
stored as a 24-bit signed integer in the 24 least-significant bits of the branch 
instruction. This displacement is added to the PC of the branch instruction plus 
1 to generate the new PC. 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 4 
Example None 


BRD Branch Unconditionally (Delayed) 


Syntax BRD src 
Operands src: in PC-relative mode 
Opcode 
31 24 23 16 15 8 7 0 


Word Fields None 
Operation PC +3 + displacement > PC 


Description Performs an unconditional delayed branch. The assembler generates a dis- 
placement: displacement = src - (PC of branch instruction + 3). This displace- 
ment is stored as a 24-bit signed integer in the 24 least-significant bits of the 
branch instruction. This displacement is added to the PC of the branch instruc- 
tion plus 3 to generate the new PC. Interrupts are disabled during the BRD 
instruction. 


The three instructions following the BRD instruction are fetched and executed. 
None of these three instructions should modify the program flow (e.g., affect 
the PC value). 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
Example None 




CALL Call Subroutine 


Syntax CALL src 
Operands src: in PC-relative mode 
Opcode 
31 24 23 16 15 8 7 0 


Word Fields None 
Operation Next PC → *(++SP) 
PC + 1 + displacement → PC 


Description Performs a call. The next PC value is pushed onto the system stack. The as- 
sembler generates a displacement: displacement = src - (PC of branch in- 
struction + 1). This displacement is stored as a 24-bit signed integer in the 24 
least-significant bits of the branch instruction. This displacement is added to 
the PC of the branch instruction plus 1 to generate the new PC. 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 4 
Example None 
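
The two steps of the CALL operation, pushing the next PC and then adding the displacement, can be sketched in Python. The function name `call` and the dictionary used as a stack are hypothetical, and the sketch assumes that "Next PC" is the address of the word following the CALL instruction.

```python
def call(pc, disp24, sp, stack):
    """Model CALL: return (new_pc, new_sp) and record the return address."""
    # Sign-extend the 24-bit displacement field.
    disp = disp24 - (1 << 24) if disp24 & 0x800000 else disp24
    sp += 1                 # Next PC -> *(++SP): pre-increment, then store
    stack[sp] = pc + 1      # assumed return address: word after the CALL
    return (pc + 1 + disp) & 0xFFFFFFFF, sp
```

A matching return would read `stack[sp]` back into the PC and decrement SP, undoing the pre-increment shown here.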




CALLcond Call Subroutine Conditionally 


Syntax CALLcond src 
Operands src: conditional-branch-addressing modes (B) 
Opcode 
31 24 23 16 15 8 7 0 
011100 | B | 0000 | cond | register or displacement 


Word Fields B src addressing modes 
0 register 
1 PC relative 


Operation If cond is true: 
Next PC → *++SP 
If src is in register-addressing mode (any register in CPU primary- 
register file), 
src → PC. 
If src is in PC-relative mode (label or address), displacement + PC + 1 → PC. 
Else, continue. 


Description A call is performed if the condition is true. If the condition is true, the next PC 
value is pushed onto the system stack. If the src operand is expressed in regis- 
ter-addressing mode, the contents of the specified register are loaded into the 
PC. If the src operand is expressed in PC-relative mode, the assembler gener- 
ates a displacement: displacement = label - (PC of call instruction + 1). This 
displacement is stored as a 16-bit signed integer in the 16 least-significant bits 
of the call instruction word. This displacement is added to the PC of the call 
instruction plus 1 to generate the new PC. 


The ’C4x provides 20 condition codes that can be used with this instruction 
(see Section 14.2 for a list of condition mnemonics, encoding, and flags). 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 5 (regardless of whether the condition is true) 


Example CALLNZ R5 

Before Instruction                After Instruction 

PC, SP, R5, Data at 9836h, and status flags LUF, LV, UF, N, Z, V, C: values not legible 


CMPF Compare Floating-Point Values 


Syntax CMPF src, dst 


Operands src: general-addressing modes (G) 
dst: register (R0 - R11) 


Opcode 
31 24 23 16 15 8 7 0 


Word Fields G src addressing modes 
00 register (R0 - R11) 
01 direct 
10 indirect 
11 immediate 


Operation dst - src 


Description The src operand is subtracted from the dst operand. The result is not loaded 
into any register; this allows for nondestructive compares. The dst and src op- 
erands are assumed to be floating-point numbers. 


Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise 
LV 1 if a floating-point overflow occurs, unchanged otherwise 
UF 1 if a floating-point underflow occurs, 0 otherwise 
N 1 if a negative result is generated, 0 otherwise 
Z 1 if a zero result is generated, 0 otherwise 
V 1 if a floating-point overflow occurs, 0 otherwise 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 


Example CMPF *+AR4,R6 

Before Instruction                After Instruction 

R6 1.4050e+02                     R6 1.4050e+02 
Data at 80 98F3h 1.4050e+02       Data at 80 98F3h 1.4050e+02 
AR4 and status flags LUF, LV, UF, N, Z, V, C: values not legible 


CMPF3 Compare Floating-Point Values, 3 Operands 


Syntax CMPF3 src2, src1 
Operands src1, src2: type 1 or type 2 three-operand addressing modes 
Opcode 
Type 1 
31 24 23 16 15 8 7 0 

Type 2 
31 24 23 16 15 8 7 0 


Word Fields 
Type 1 
T   src1 addressing modes                        src2 addressing modes 
00  register mode (R0 - R11)                     register mode (R0 - R11) 
01  indirect mode (disp = 0, 1, IR0, IR1)        register mode (R0 - R11) 
10  register mode (R0 - R11)                     indirect mode (disp = 0, 1, IR0, IR1) 
11  indirect mode (disp = 0, 1, IR0, IR1)        indirect mode (disp = 0, 1, IR0, IR1) 

Type 2 
T   src1 addressing modes                                   src2 addressing modes 
01  register mode (R0 - R11)                                indirect mode *+ARn(5-bit unsigned displacement) 
11  indirect mode *+ARn1(5-bit unsigned displacement)       indirect mode *+ARn2(5-bit unsigned displacement) 


Operation src1 - src2 


Description The src2 operand is subtracted from the src1 operand. The result is not loaded 
into any register. This allows for nondestructive compares. The src1 and src2 
operands are assumed to be floating-point numbers. 


Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise 
LV 1 if a floating-point overflow occurs, unchanged otherwise 
UF 1 if a floating-point underflow occurs, 0 otherwise 
N 1 if a negative result is generated, 0 otherwise 
Z 1 if a zero result is generated, 0 otherwise 
V 1 if a floating-point overflow occurs, 0 otherwise 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
Example CMPF3 *AR2, *AR3-(1) 

Before Instruction                After Instruction 

Data at 809831h 2.5044e+02        Data at 809831h 2.5044e+02 
Data at 809852h 6.253125e+01      Data at 809852h 6.253125e+01 
AR2, AR3, and status flags LUF, LV, UF, N, Z, V, C: values not legible 


CMPI Compare Integer 


Syntax CMPI src, dst 


Operands src: general-addressing modes (G) 
dst: register (any register in CPU primary-register file) 


Opcode 
31 24 23 16 15 8 7 0 


Word Fields G src addressing modes 
00 register (any register in CPU primary-register file) 
01 direct 
10 indirect 
11 immediate 


Operation dst - src 


Description The src operand is subtracted from the dst operand. The result is not loaded 
into any register; this allows for nondestructive compares. The dst and src op- 
erands are assumed to be signed integers. 


Status Bits LUF Unaffected 
LV 1 if an integer overflow occurs, unchanged otherwise 
UF 0 
N 1 if a negative result is generated, 0 otherwise 
Z 1 if a zero result is generated, 0 otherwise 
V 1 if an integer overflow occurs, 0 otherwise 
C 1 if a borrow occurs, 0 otherwise 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
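
The way CMPI derives its condition flags from dst - src without storing the difference can be sketched in Python. The function name `cmpi_flags` is hypothetical, the latched LV flag is omitted, and N is taken here as the MSB of the wrapped 32-bit difference, which is an assumption for the overflow case.

```python
MASK32 = 0xFFFFFFFF


def cmpi_flags(src, dst):
    """Return the N, Z, V, C flags produced by the compare dst - src."""
    a, b = dst & MASK32, src & MASK32
    diff = (a - b) & MASK32                      # result is discarded afterwards
    # Signed overflow: operands differ in sign and the result's sign
    # differs from the sign of dst.
    overflow = ((a ^ b) & (a ^ diff)) >> 31
    return {
        "N": diff >> 31,
        "Z": int(diff == 0),
        "V": overflow,
        "C": int(b > a),                         # borrow out of the subtraction
    }
```

Comparing a dst of 1000 against a src of 2200 gives a negative difference, so the sketch reports N = 1 and C = 1 with Z and V clear, and neither operand is modified.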


Example CMPI R3,R7 

Before Instruction                After Instruction 

R3 2200                           R3 2200 
R7 1000                           R7 1000 
Status flags LUF, LV, UF, N, Z, V, C: values not legible 


CMPI3 Compare Integer, 3 Operands 


Syntax CMPI3 src2, src1 
Operands src1, src2: type 1 or type 2 three-operand addressing modes 
Opcode 
Type 1 
31 24 23 16 15 8 7 0 

Type 2 
31 24 23 16 15 8 7 0 


Word Fields 
Type 1 
T   src1 addressing modes                        src2 addressing modes 
00  register mode (any CPU register)             register mode (any CPU register) 
01  indirect mode (disp = 0, 1, IR0, IR1)        register mode (any CPU register) 
10  register mode (any CPU register)             indirect mode (disp = 0, 1, IR0, IR1) 
11  indirect mode (disp = 0, 1, IR0, IR1)        indirect mode (disp = 0, 1, IR0, IR1) 

Type 2 
T   src1 addressing modes                                   src2 addressing modes 
00  register mode (any CPU register)                        8-bit signed immediate 
01  register mode (any CPU register)                        indirect mode *+ARn(5-bit unsigned displacement) 
10  indirect mode *+ARn(5-bit unsigned displacement)        8-bit signed immediate 
11  indirect mode *+ARn1(5-bit unsigned displacement)       indirect mode *+ARn2(5-bit unsigned displacement) 


Operation src1 - src2 


Description The src2 operand is subtracted from the src1 operand. The result is not loaded 
into any register. This allows for nondestructive compares. The src1 and src2 
operands are assumed to be signed integers. Although this instruction has 
only two operands, it is designated as a three-operand instruction because op- 
erands are specified in the three-operand format. 


Status Bits LUF Unaffected 
LV 1 if an integer overflow occurs, unchanged otherwise 
UF 0 
N 1 if a negative result is generated, 0 otherwise 
Z 1 if a zero result is generated, 0 otherwise 
V 1 if an integer overflow occurs, 0 otherwise 
C 1 if a borrow occurs, 0 otherwise 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
Example None 


DBcond Decrement and Branch Conditionally (Standard) 


Syntax DBcond ARn, src 


Operands src: conditional-branch-addressing modes (B) 
ARn: auxiliary register 


Opcode 
31 24 23 16 15 8 7 0 
011011 | B | ARn | 0 | cond | register or displacement 


Word Fields B src addressing modes 
0 register 
1 PC relative 


Operation ARn - 1 → ARn 
If cond is true and ARn >= 0: 
If src is in register-addressing mode (any register in CPU primary- 
register file), 
src → PC. 
If src is in PC-relative mode (label or address), displacement + PC + 1 → PC. 
Else, continue. 


Description DBcond signifies a standard branch that executes in four cycles because the 
pipeline must be flushed if cond is true. If the condition is true and the specified 
auxiliary register is greater than or equal to 0, the specified auxiliary register 
is decremented and a branch is performed. 


The auxiliary register is treated as a 32-bit signed integer. Note that the branch 
condition does not depend on the auxiliary register decrement. 


If the src operand is expressed in register-addressing mode, the contents of 
the specified register are loaded into the PC. If the src operand is expressed 
in PC-relative addressing mode, the assembler generates a displacement: 
displacement = label - (PC of branch instruction + 1). This displacement is stored as 
a 16-bit signed integer in the 16 least-significant bits of the branch instruction 
word. This displacement is added to the PC of the branch instruction plus 1 to 
generate the new PC. 


The ’C4x provides 20 condition codes that can be used with this instruction 
(see Section 14.2 for a list of condition mnemonics, encoding, and flags). 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 4 
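
The decrement-and-branch behavior of DBcond, as listed in its Operation field, can be sketched in Python to show how it closes a counted loop. The name `dbcond` is hypothetical, and the sketch assumes the ARn >= 0 test is applied to the value after the decrement, following the order in which the Operation field lists the two steps.

```python
def dbcond(arn, cond, pc, src, pc_relative=True):
    """Model DBcond ARn, src; return (new_arn, new_pc)."""
    arn = (arn - 1) & 0xFFFFFFFF
    if arn & 0x80000000:                 # ARn is a 32-bit signed integer
        arn -= 1 << 32
    if cond and arn >= 0:
        # Register mode loads src; PC-relative mode adds the displacement.
        new_pc = pc + 1 + src if pc_relative else src
    else:
        new_pc = pc + 1                  # fall through
    return arn, new_pc
```

Under that assumption, a loop closed by an always-true DBcond with ARn initialized to 3 runs its body four times and leaves ARn at -1.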


Example DBLT AR3,R2 

Before Instruction                After Instruction 

PC, AR3, R2, and status flags LUF, LV, UF, N, Z, V, C: values not legible 


DBcondD Decrement and Branch Conditionally (Delayed) 


Syntax DBcondD ARn, src 


Operands src: conditional-branch-addressing modes (B) 
ARn: auxiliary register 


Opcode 
31 24 23 16 15 8 7 0 
011011 | B | ARn | 1 | cond | register or displacement 


Word Fields B src addressing modes 
0 register 
1 PC relative 


Operation ARn - 1 → ARn 
If cond is true and ARn >= 0: 
If src is in register-addressing mode (any register in CPU primary- 
register file), 
src → PC. 
If src is in PC-relative mode (label or address), displacement + PC + 3 → PC. 
Else, continue. 


Description DBcondD signifies a delayed branch that allows the three instructions after the 
delayed branch to be fetched before the PC is modified. The effect is a single- 
cycle branch. If the condition is true and the specified auxiliary register is great- 
er than or equal to zero, the specified auxiliary register is decremented and a 
branch is performed. (The three instructions following the DBcondD must not 
affect the cond.) 


The auxiliary register is treated as a 32-bit signed integer. None of the three 
instructions following DBcondD should modify the program flow. Interrupts are 
disabled for the duration of the DBcondD instruction. Note that the branch con- 
dition does not depend on the auxiliary register decrement. 


If the src operand is expressed in register-addressing mode, the contents of 
the specified register are loaded into the PC. If the src is expressed in PC-rela- 
tive addressing, the assembler generates a displacement: displacement = la- 
bel - (PC of branch instruction + 3). This displacement is added to the PC of 
the branch instruction plus 3 to generate the new PC. Note that bit 21 = 1 for 
a delayed branch. 


The ’C4x provides 20 condition codes that can be used with this instruction 
(see Section 14.2 for a list of condition mnemonics, encoding, and flags). 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 


Example DBZD AR5, $+110h 

Before Instruction                After Instruction 

AR5 (value not legible)           AR5 66h 
PC and status flags LUF, LV, UF, N, Z, V, C: values not legible 


FIX Floating-Point to Integer Conversion 


Syntax FIX src, dst 


Operands src: general-addressing modes (G) 
dst: register (any register in CPU primary-register file) 


Opcode 
31 24 23 16 15 8 7 0 
000 | 001010 | G | dst | src 


Word Fields G src addressing modes 
00 register (R0 - R11) 
01 direct 
10 indirect 
11 immediate 


Operation fix(src) → dst 


Description The floating-point operand src is converted to the nearest integer less than or 
equal to it in value, and the result is loaded into the dst register. The src oper- 
and is assumed to be a floating-point number and the dst operand is assumed 
to be a signed integer. 


The exponent field of the result register (if it has one) is not modified. 


Integer overflow occurs when the floating-point number is too large to be rep- 
resented as a 32-bit 2s-complement integer. In the case of integer overflow, 
the result is saturated in the direction of overflow. 


Status Bits If ST (SET COND) = 0 and the destination register is R0 - R11, the condition 
flags are modified. If ST (SET COND) = 1, they are modified for all destination 
registers. 

LUF Unaffected 
LV 1 if an integer overflow occurs, unchanged otherwise 
UF 0 
N 1 if a negative result is generated, 0 otherwise 
Z 1 if a zero result is generated, 0 otherwise 
V 1 if an integer overflow occurs, 0 otherwise 
C Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
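
The conversion rule described for FIX, rounding toward the nearest integer less than or equal to the operand and saturating on overflow, can be sketched in Python. The name `fix` is hypothetical, and a Python float stands in for the ’C4x floating-point format, so the sketch illustrates the rounding and saturation only, not the bit-level encoding.

```python
import math

INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000


def fix(value):
    """Return (integer result, overflow flag) for a FIX-style conversion."""
    result = math.floor(value)      # nearest integer less than or equal to value
    if result > INT32_MAX:
        return INT32_MAX, 1         # saturate in the positive direction
    if result < INT32_MIN:
        return INT32_MIN, 1         # saturate in the negative direction
    return result, 0
```

Note that rounding toward minus infinity means negative inputs move away from zero: -1.5 converts to -2, not -1.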


Example FIX R1,R2 

Before Instruction                After Instruction 

R1 1.3454e+3                      R1 1.3454e+3 
R2 (value not legible)            R2 1345 
Status flags LUF, LV, UF, N, Z, V, C: values not legible 


FIX||STI     Parallel FIX and STI

Syntax       FIX src2, dst1
             || STI src3, dst2

Operands     src2: indirect (disp = 0, 1, IR0, IR1)
             dst1: register (R0-R7)
             src3: register (R0-R7)
             dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  None

Operation    fix(src2) → dst1
             || src3 → dst2

Description  A floating-point-to-integer conversion is performed. All registers are
             read at the beginning and loaded at the end of the execute cycle. This
             means that if one of the parallel operations (STI) reads from a register
             and the operation being performed in parallel (FIX) writes to the same
             register, then STI accepts as input the contents of the register before
             it is modified by FIX.

             If src2 and dst2 point to the same location, src2 is read before the
             write to dst2.

             Integer overflow occurs when the floating-point number is too large to be
             represented as a 32-bit 2s-complement integer. In the case of integer
             overflow, the result is saturated in the direction of overflow.

Status Bits  LUF  Unaffected
             LV   1 if an integer overflow occurs, unchanged otherwise
             UF   0
             N    1 if a negative result is generated, 0 otherwise
             Z    1 if a zero result is generated, 0 otherwise
             V    1 if an integer overflow occurs, 0 otherwise
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1
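The read-before-write ordering described for the parallel pair can be illustrated with a small Python model. It is a sketch under stated assumptions: the dictionaries `regs` and `mem`, the function name, and the use of host floats for memory contents are hypothetical, and overflow saturation is left out for brevity.

```python
import math

def fix_parallel_sti(regs, mem, src2_addr, dst1, src3, dst2_addr):
    """Model FIX src2,dst1 || STI src3,dst2 with read-before-write ordering."""
    # Read phase: every source is sampled before anything is written.
    fix_input = mem[src2_addr]
    store_value = regs[src3]
    # Write phase: both results are committed at the end of the cycle.
    regs[dst1] = math.floor(fix_input)
    mem[dst2_addr] = store_value

# Values follow the FIX||STI example; R1's initial value is an assumption.
regs = {"R0": 220, "R1": 0}
mem = {0x8098A3: 179.75, 0x80983C: 0}
fix_parallel_sti(regs, mem, 0x8098A3, "R1", "R0", 0x80983C)
print(regs["R1"], mem[0x80983C])
```

If `dst1` and `src3` name the same register, the store still sees the old register contents, because the register is sampled in the read phase.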


Example      FIX *++AR4(1),R1
             || STI R0,*AR2

                                Before Instruction         After Instruction
             AR4                80 98A2h                   80 98A3h
             R1                                            179
             R0                 220                        220
             AR2                80 983Ch                   80 983Ch
             Data at 80 98A3h   733 C000h (1.79750e+02)    733 C000h (1.79750e+02)
             Data at 80 983Ch                              0DCh (220)

             [Status-flag boxes (LUF, LV, UF, N, Z, V, C) not recoverable]


FLOAT        Integer to Floating-Point Conversion

Syntax       FLOAT src, dst

Operands     src: general-addressing modes (G)
             dst: register (R0-R11)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             00   register (any register in CPU primary-register file)
             01   direct
             10   indirect
             11   immediate

Operation    float(src) → dst

Description  The integer operand src is converted to the floating-point value equal
             to it, and the result is loaded into the dst register. The src operand
             is assumed to be a signed integer, and the dst operand is assumed to be
             a floating-point number.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   0
             N    1 if a negative result is generated, 0 otherwise
             Z    1 if a zero result is generated, 0 otherwise
             V    0
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1
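To make the integer-to-floating-point step concrete, the sketch below encodes a signed 32-bit integer into a 40-bit extended-precision word and decodes it again. It assumes the 2s-complement floating-point layout described elsewhere in this manual (8-bit 2s-complement exponent, sign bit, 31-bit fraction with an implied leading bit, exponent -128 reserved for zero); the function names are hypothetical.

```python
def float_to_c4x40(n: int) -> int:
    """Encode a signed 32-bit integer as a 40-bit extended-precision word."""
    if not -(1 << 31) <= n < (1 << 31):
        raise ValueError("src must fit in 32 bits")
    if n == 0:
        return 0x80 << 32                       # exponent -128, mantissa 0
    if n > 0:
        e = n.bit_length() - 1                  # mantissa in [1, 2)
        frac = (n - (1 << e)) << (31 - e)
        sign = 0
    else:
        e = (-n - 1).bit_length() - 1           # mantissa in [-2, -1)
        frac = (n + (1 << (e + 1))) << (31 - e)
        sign = 1
    return ((e & 0xFF) << 32) | (sign << 31) | frac

def c4x40_to_float(word: int) -> float:
    """Decode a 40-bit extended-precision word to a host float."""
    e = (word >> 32) & 0xFF
    if e >= 0x80:
        e -= 0x100                              # 2s-complement exponent
    if e == -128:
        return 0.0
    sign = (word >> 31) & 1
    frac = (word & 0x7FFFFFFF) / 2.0**31
    mantissa = frac - 2.0 if sign else frac + 1.0
    return mantissa * 2.0**e

word = float_to_c4x40(174)                      # operand of the FLOAT example
print(f"{word:010X}", c4x40_to_float(word))
```

Decoding the register image 034C20 0000h with the same helper yields 12.7578125, which agrees with the value printed beside it in the examples.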


Example      FLOAT *++AR2(2),R5

                                Before Instruction             After Instruction
             AR2                80 9800h                       80 9802h
             R5                 034C 2000h (1.27578125e+01)    1.74e+02
             Data at 80 9802h   0AEh (174)                     0AEh (174)

             [Status-flag boxes (LUF, LV, UF, N, Z, V, C) not recoverable]


FLOAT||STF   Parallel FLOAT and STF

Syntax       FLOAT src2, dst1
             || STF src3, dst2

Operands     src2: indirect (disp = 0, 1, IR0, IR1)
             dst1: register (R0-R7)
             src3: register (R0-R7)
             dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Operation    float(src2) → dst1
             || src3 → dst2

Description  An integer-to-floating-point conversion is performed. All registers are
             read at the beginning and loaded at the end of the execute cycle. This
             means that if one of the parallel operations (STF) reads from a register
             and the operation being performed in parallel (FLOAT) writes to the same
             register, then STF accepts as input the contents of the register before
             it is modified by FLOAT.

             If src2 and dst2 point to the same location, src2 is read before the
             write to dst2.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   0
             N    1 if a negative result is generated, 0 otherwise
             Z    1 if a zero result is generated, 0 otherwise
             V    0
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1


Example      FLOAT *+AR2(IR0),R6
             || STF R7,*AR1

                                Before Instruction               After Instruction
             AR2
             IR0
             R6                                                  1.740e+02
             R7                 034C20 0000h (1.27578125e+01)    1.27578125e+01
             AR1
             Data at 80 98CDh   0AEh (174)                       174
             Data at 80 9933h   0h                               1.27578125e+01

             [Values of AR2, IR0, AR1 and the status-flag boxes are not recoverable]


FRIEEE       Convert From IEEE Format

Syntax       FRIEEE src, dst

Operands     src: direct- or indirect-addressing modes
             dst: extended-precision register (R0-R11)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             01   direct
             10   indirect

Operation    convert src from IEEE format → dst

Description  The src operand is converted from the IEEE floating-point format to the
             2s-complement floating-point format.

             The src operand comes from memory. The converted result goes into an
             extended-precision register as a single-precision floating-point number.

Status Bits  LUF  Unaffected
             LV   Set if overflow, otherwise unchanged
             UF   0
             N    Sign of the result
             Z    1 if result is 0, 0 otherwise
             V    1 if overflow, 0 otherwise
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      None
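As an illustration of what the format conversion involves, the following Python sketch converts an IEEE 754 single-precision value to a 32-bit word in the 2s-complement floating-point format. It is a simplified model, not the device algorithm: it assumes the layout used throughout this manual (8-bit 2s-complement exponent, sign bit, 23-bit fraction), handles only zero and normalized numbers, and the function name `ieee_to_c4x` is hypothetical.

```python
import struct

def ieee_to_c4x(value: float) -> int:
    """Convert an IEEE single-precision value to a 32-bit 2s-complement float word."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    if exp == 0 and frac == 0:
        return 0x80000000                 # zero: exponent -128, mantissa 0
    if exp in (0, 255):
        raise ValueError("denormals, infinities, and NaNs are not modeled")
    e = exp - 127                         # remove the IEEE bias
    if sign:
        if frac == 0:
            e -= 1                        # -1.0 * 2^e is stored as -2.0 * 2^(e-1)
        else:
            frac = (1 << 23) - frac       # -1.f becomes -2 + (1 - 0.f)
    return ((e & 0xFF) << 24) | (sign << 23) | frac

for x in (179.75, 140.5, -62.8125):
    print(x, f"{ieee_to_c4x(x):08X}")
```

The three inputs are values that appear in the examples of this chapter, so the resulting words can be compared with the memory images shown there (733 C000h, 70C 8000h, and 584 C000h).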


FRIEEE||STF  Parallel FRIEEE and STF

Syntax       FRIEEE src2, dst1
             || STF src3, dst2

Operands     src2: indirect mode (disp = 0, 1, IR0, IR1)
             dst1: register mode (R0-R7)
             src3: register mode (R0-R7)
             dst2: indirect mode (disp = 0, 1, IR0, IR1)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Operation    convert src2 from IEEE format → dst1
             in parallel with src3 → dst2

Description  The src2 operand is converted from the IEEE floating-point format to the
             2s-complement format. The converted result goes into an extended-precision
             register dst1 as a single-precision floating-point number.

             A floating-point store is done in parallel.

             If src2 and dst2 point to the same location, then src2 is read before the
             write to dst2.

Status Bits  LUF  Unaffected
             LV   Set if overflow, otherwise unchanged
             UF   0
             N    Sign of the result
             Z    1 if result is 0, 0 otherwise
             V    1 if overflow, 0 otherwise
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      None


IACK         Interrupt Acknowledge

Syntax       IACK src

Operands     src: general-addressing modes (G)

Opcode       000 | 110110 | G | 00000 | src

Word Fields  G    src addressing modes
             01   direct
             10   indirect

Operation    Perform a dummy-read operation with IACK = 0.
             At end of dummy read, set IACK to 1.

Description  A dummy-read operation at the address pointed to by src is performed
             with IACK = 0. At the end of the dummy read, IACK is set to 1 if off-chip
             memory is specified. This instruction can be used to generate an
             external-interrupt acknowledge. The IACK signal and the address can then
             be used to signal interrupt acknowledge to external devices. The data
             read by the processor is unused. Note that the IACK signal is extended
             with multicycle reads.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1


Example      IACK *AR5

                    Before Instruction    After Instruction
             IACK   1
             PC

             [PC values, the IACK value after the instruction, and the status-flag
             boxes are not recoverable]


IDLE         Idle Until Interrupt

Syntax       IDLE

Operands     None

Opcode       000 | 001100 | 00000000000000000000000

Word Fields  None

Operation    1 → ST(GIE)
             Next PC → PC
             Idle until interrupt

Description  The global-interrupt enable bit is set, the next PC value is loaded into
             the PC, and the CPU idles until an unmasked interrupt is received. When
             the interrupt is received, the contents of the PC are pushed onto the
             active system stack, and the processor jumps to execute the interrupt
             service routine.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      None


IDLE2        Idle Until Interrupt 2

Syntax       IDLE2 (’C40 revision > 5.0 and ’C44 only)

Operands     None

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  None

Operation    1 → ST(GIE)
             Next PC → PC
             Idle until interrupt

Description  The IDLE2 instruction performs the same function as IDLE, except that it
             removes the functional clock input from the internal device. This allows
             for an extremely low-power mode. The PC is incremented once, and the
             device remains in an idle state until one of the external interrupts
             (NMI or IIOFx) is asserted.

             In IDLE2 mode, the ’C4x behaves as follows:

             - The CPU, peripherals, and memory retain their previous states.

             - When the device is in the functional (nonemulation) mode, the clocks
               stop with H1 high and H3 low.

             - The ’C4x remains in IDLE2 until one of the external interrupts (NMI or
               IIOFx) is asserted for at least two H1 clock cycles. Then, the clocks
               start after a delay of one H1 cycle. The clocks can start up in the
               phase opposite that in which they were stopped (that is, H1 might start
               high when H3 was high before stopping, and H3 might start high when H1
               was high before stopping). However, the H1 and H3 clocks remain 180°
               out of phase with each other.

             - During IDLE2 operation, for one of the external interrupts to be
               recognized and serviced by the CPU, it must be asserted for at least
               two H1 cycles. For the processor to recognize only one interrupt when
               it restarts operation, the interrupt pin must be configured for
               edge-triggered mode or asserted for less than three cycles in
               level-triggered mode.

             - When the ’C4x is in the emulation mode, the H1 and H3 clocks continue
               to run normally, and the CPU operates as if an IDLE instruction had
               been executed. The clocks continue to run for correct operation of the
               emulator.

             - Any external interrupt pin can wake up the device from IDLE2, but for
               the CPU to recognize that interrupt, the interrupt must also be
               enabled. If an interrupt is recognized and executed by the CPU, the
               instruction following the IDLE2 instruction is not executed until
               after a return is executed.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      None


LAJ          Link and Jump

Syntax       LAJ src

Operands     src: in PC-relative mode

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  None

Operation    PC of LAJ + 4 → extended-precision register R11
             displacement + 3 + PC of LAJ → PC

Description  LAJ performs a single-cycle delayed subroutine call that allows the three
             instructions following the LAJ instruction to be performed before
             branching. The return address (address of the LAJ instruction + 4) is
             placed in extended-precision register R11. The assembler generates a
             displacement: displacement = src - (PC of branch instruction + 1). This
             displacement is stored as a 24-bit signed integer in the 24 least
             significant bits of the branch instruction. This displacement is added
             to the PC of the branch instruction plus 1 to generate the new PC. See
             Section 6.6 on page 6-19 for details.

             None of the three instructions following the LAJ instruction should
             modify R11 or the program flow. Interrupts are disabled for the duration
             of the LAJ instruction.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      None


LAJcond      Link and Jump Conditionally

Syntax       LAJcond src

Operands     src: conditional-branch addressing modes

Opcode       [Opcode diagram partially legible: leading bits 011100, then B, cond,
             and a register-or-displacement field]

Word Fields  B    src addressing modes
             0    register
             1    PC relative

Operation    If (cond is true)
                If (src is a register)
                   PC of LAJcond + 4 → extended-precision register R11
                   src → PC
                If (src is in PC-relative mode)
                   PC of LAJcond + 4 → extended-precision register R11
                   displacement = src - (PC of LAJcond + 3)
                   displacement + PC of LAJcond + 3 → PC
             Else, continue.

Description  LAJcond performs a conditional single-cycle delayed subroutine call that
             allows the three instructions following the LAJcond instruction to be
             performed before branching, without affecting the cond. The return
             address (address of the LAJcond instruction + 4) is placed in
             extended-precision register R11. The address branched to is formed by
             either register mode or PC-relative mode.

             None of the three instructions following the LAJcond instruction should
             modify R11 or the program flow. Interrupts are disabled for the duration
             of the LAJcond instruction.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      None
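The Operation field of LAJcond can be followed step by step in a short Python model. This is a sketch only: the function name, its arguments, and the convention of returning `None` for a branch that is not taken are assumptions made for illustration, and the three delayed instructions that execute before the branch takes effect are not modeled.

```python
def lajcond(pc, cond, src, pc_relative, r11):
    """Return (branch_target, r11) after LAJcond at address pc.

    src is a register value in register mode, or the signed displacement
    stored in the instruction in PC-relative mode.
    """
    if not cond:
        return None, r11                  # condition false: continue, R11 untouched
    r11 = pc + 4                          # return address: past the 3 delay slots
    target = pc + 3 + src if pc_relative else src
    return target, r11

print(lajcond(0x100, True, 0x20, pc_relative=True, r11=0))
```

The return address is PC + 4 rather than PC + 1 because the three instructions after LAJcond have already executed by the time the subroutine returns.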


LATcond      Link and Trap Conditionally

Syntax       LATcond N

Operands     N: immediate mode, trap number (0 ≤ N ≤ 511)

Opcode       011101001 | 00 | cond | 0000000 | N

Word Fields  None

Operation    If (cond is true)
                ST(GIE) → ST(PGIE)
                ST(CF) → ST(PCF)
                0 → ST(GIE)
                1 → ST(CF)
                PC of LATcond + 4 → extended-precision register R11
                trap vector N → PC
             Else, continue.

Description  The LATcond instruction performs a conditional delayed single-cycle trap.
             If the condition is true, ST bits GIE and CF are saved in PGIE and PCF in
             the status register. Then all interrupts are disabled (0 → GIE), and the
             cache is frozen (1 → CF). The address of the LATcond instruction + 4 is
             placed in R11, and the PC is loaded with the contents of the specified
             trap vector (N). If the condition is not true, normal operation
             continues. If traps are to be nested, you may need to save the status
             register before executing LATcond.

             The three instructions following LATcond are fetched and executed, but
             they do not affect the cond. They should not modify the program flow or
             directly modify the status register. Interrupts are disabled for the
             duration of the LATcond N instruction.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      None
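The sequence of status-register updates performed by LATcond can be traced with a small Python model. It is illustrative only: the status register is represented as a dictionary of the four bits involved, `trap_vectors` is a hypothetical table mapping trap numbers to handler addresses, and the three delayed instructions are not modeled.

```python
def latcond(cond, n, pc, st, r11, trap_vectors):
    """Return (new_pc_or_None, new_st, new_r11) for LATcond N at address pc."""
    if not 0 <= n <= 511:
        raise ValueError("trap number must be in 0..511")
    if not cond:
        return None, dict(st), r11        # condition false: nothing changes
    new_st = dict(st)
    new_st["PGIE"] = st["GIE"]            # save the global-interrupt enable bit
    new_st["PCF"] = st["CF"]              # save the cache-freeze bit
    new_st["GIE"] = 0                     # disable interrupts
    new_st["CF"] = 1                      # freeze the cache
    return trap_vectors[n], new_st, pc + 4

st = {"GIE": 1, "CF": 0, "PGIE": 0, "PCF": 0}
print(latcond(True, 5, 0x200, st, 0, {5: 0x4000}))
```

Because only one previous copy of GIE and CF is kept, a nested trap overwrites PGIE and PCF, which is why the status register may need to be saved first.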


LBb          Load Byte

Syntax       LBb src, dst

Operands     src: register, direct, 16-bit immediate, or indirect-addressing modes
             dst: register mode (any register in CPU primary-register file)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes         B    src byte
             00   register mode                00   byte 0 (LS byte)
             01   direct mode                  01   byte 1
             10   indirect mode                10   byte 2
             11   immediate mode (16 bits)     11   byte 3 (MS byte)

Operation    Sign-extended byte (3, 2, 1, 0) of src → dst
             b = byte to load (3, 2, 1, 0)

             | 3 | 2 | 1 | 0 |  = b (byte designator 3-0)

Description  The specified byte of the src operand is sign-extended and right-shifted
             into the eight LSBs of the dst register. The src byte is signed. When
             immediate mode is specified and byte 2 (B = 10) or byte 3 (B = 11) is
             selected, the LBb instruction performs sign extension of the 16-bit
             value. Consequently, the value of 00h or FFh is stored into the eight
             LSBs of the dst register.

Status Bits  If ST(SET COND) = 0 and the destination register is R0-R11, the condition
             flags are modified. If ST(SET COND) = 1, they are modified for all
             destination registers.

             LUF  Unaffected
             LV   Unaffected
             UF   0
             N    1 if a negative result is generated, 0 otherwise
             Z    1 if a zero result is generated, 0 otherwise
             V    0
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1
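The byte selection and sign extension performed by LBb can be sketched in Python as follows. The helper names `sign_extend` and `lb` and the `immediate` flag are assumptions made for this illustration; results are returned as unsigned 32-bit register images.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a 2s-complement number."""
    mask = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ mask) - mask

def lb(src: int, b: int, immediate: bool = False) -> int:
    """Model LBb: load sign-extended byte b (0 = LS byte, 3 = MS byte) of src."""
    if b not in (0, 1, 2, 3):
        raise ValueError("byte designator must be 0..3")
    if immediate:
        src = sign_extend(src, 16)        # 16-bit immediate is widened first
    byte = (src >> (8 * b)) & 0xFF        # right-shift the chosen byte into the LSBs
    return sign_extend(byte, 8) & 0xFFFFFFFF

print(f"{lb(0x00AB0000, 2):08X}")         # byte 2 is ABh, which is negative as a byte
```

With an immediate operand, bytes 2 and 3 contain only copies of the sign bit of the 16-bit value, so the selected byte is always 00h or FFh.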


Example      LB2 R1,R2     ; sign-extended byte 2 of R1 → R2

                  Before Instruction    After Instruction
             R1
             R2

             [Register values not recoverable]


LBUb         Load Byte Unsigned

Syntax       LBUb src, dst

Operands     src: register, direct, 16-bit immediate, or indirect-addressing modes
             dst: register mode (any register in CPU primary-register file)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes         B    src byte
             00   register mode                00   byte 0 (LS byte)
             01   direct mode                  01   byte 1
             10   indirect mode                10   byte 2
             11   immediate mode (16 bits)     11   byte 3 (MS byte)

Operation    Byte (3, 2, 1, 0) of src → dst
             b = byte to load (3, 2, 1, 0)

             | 3 | 2 | 1 | 0 |  = b (byte designator 3-0)

Description  The specified byte of the src operand is right-shifted, without sign
             extension, into the eight LSBs of the dst register. The src byte is
             unsigned. When immediate mode is specified and byte 2 (B = 10) or byte 3
             (B = 11) is selected, the LBUb instruction performs sign extension of the
             16-bit value. Consequently, the value of 00h or FFh is stored into the
             eight LSBs of the dst register.

Status Bits  If ST(SET COND) = 0 and the destination register is R0-R11, the condition
             flags are modified. If ST(SET COND) = 1, they are modified for all
             destination registers.

             LUF  Unaffected
             LV   Unaffected
             UF   0
             N    0
             Z    1 if a zero result is generated, 0 otherwise
             V    0
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      LBU2 R1,R2

                  Before Instruction    After Instruction
             R1   00AB 0000h            00AB 0000h
             R2   0000 0000h            0000 00ABh
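For comparison with the signed form, the unsigned load reduces to a shift and a mask. The Python sketch below reproduces the LBU2 example; the function name `lbu` and the returned flag dictionary are assumptions for illustration, and the register-mode case is the only one modeled.

```python
def lbu(src: int, b: int):
    """Model LBUb in register mode: zero-extended byte b of src, plus flags."""
    if b not in (0, 1, 2, 3):
        raise ValueError("byte designator must be 0..3")
    result = (src >> (8 * b)) & 0xFF      # upper 24 bits of the result are 0
    flags = {"UF": 0, "N": 0, "Z": int(result == 0), "V": 0}
    return result, flags

result, flags = lbu(0x00AB0000, 2)        # operands of the LBU2 R1,R2 example
print(f"{result:08X}", flags)
```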


LDA          Load Address Register

Syntax       LDA src, dst

Operands     src: general-addressing modes
             dst: register mode (address registers only)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             00   register (any register in CPU primary-register file)
             01   direct
             10   indirect
             11   immediate

Operation    src → dst

Description  The src operand is loaded into the dst register. The dst register can be
             any of the address registers: AR0-AR7, IR0, IR1, DP, BK, or SP. The load
             is complete by the end of the read phase of the pipeline. As a result,
             LDA is one cycle faster than LDI for loading these registers. (All
             operands are treated as signed integers.)

             The src and dst operands cannot be the same register.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      None


LDE          Load Floating-Point Exponent

Syntax       LDE src, dst

Operands     src: general-addressing modes (G)
             dst: register (R0-R11)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             00   register (R0-R11)
             01   direct
             10   indirect
             11   immediate

Operation    src(exp) → dst(exp)

Description  The exponent field of the src operand is loaded into the exponent field
             of the dst register. No modification of the dst register mantissa field
             is made unless the value of the exponent loaded is the reserved value of
             the exponent for zero as determined by the precision of the src operand.
             Then, the mantissa field of the dst register is set to 0. The src and dst
             operands are assumed to be floating-point numbers. Immediate values are
             evaluated in the short floating-point format.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1
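The exponent transfer can be modeled on 40-bit register images, as in the Python sketch below. It covers register-to-register operation only and assumes the extended-precision layout (8-bit exponent in the top byte, 32-bit mantissa below it) with 80h, that is -128, as the reserved exponent for zero; the function name `lde` is hypothetical.

```python
ZERO_EXPONENT = 0x80                      # assumed reserved exponent (-128) for zero

def lde(src40: int, dst40: int) -> int:
    """Model LDE src,dst on 40-bit register images; returns the new dst."""
    exp = (src40 >> 32) & 0xFF
    if exp == ZERO_EXPONENT:
        return ZERO_EXPONENT << 32        # loading the zero exponent clears the mantissa
    return (exp << 32) | (dst40 & 0xFFFFFFFF)   # mantissa of dst is kept

new_r5 = lde(0x0200056F30, 0x0A056FE332)  # R0 and R5 from the LDE R0,R5 example
print(f"{new_r5:010X}")
```

Keeping the mantissa while replacing the exponent rescales the value by a power of two, which is why R5 changes from about 1067.5 to about 4.17 in the example.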


Example      LDE R0,R5

                  Before Instruction                After Instruction
             R0   020005 6F30h (4.00066337e+00)     020005 6F30h (4.00066337e+00)
             R5   0A056F E332h (1.06749648e+03)     02056F E332h (4.16990814e+00)

             [Status-flag boxes (LUF, LV, UF, N, Z, V, C) not recoverable]


LDEP Load Integer From Expansion Register File to Primary Register File 


Syntax LDEP src, dst 


Operands src: expansion-register-file register (IVTP or TVTP)
dst: register mode (any register in CPU primary-register file)


Opcode 
31 24 23 1615 87 0 


011101100/00] at |0000000000 src 


Word Fields None 
Operation src > dst 


Description The LDEP instruction loads a CPU register with the contents of the IVTP regis- 
ter (interrupt-trap table pointer) or the TVTP register. These registers are de- 
scribed in Section 3.2. 


The src operand register from the expansion-register file is loaded into the dst 
register in the primary register file. The dst register content is assumed to be 
an integer. 


Status Bits LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
Vv Unaffected 
Cc Unaffected 


Mode Bit OVM operation is not affected by OVM bit value. 
Cycles 1 
Example None 




LDF          Load Floating-Point Value

Syntax       LDF src, dst

Operands     src: general-addressing modes (G)
             dst: register (R0-R11)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             00   register (R0-R11)
             01   direct
             10   indirect
             11   immediate

Operation    src → dst

Description  The src operand is loaded into the dst register. The dst and src operands
             are assumed to be floating-point numbers.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   0
             N    1 if a negative result is loaded, 0 otherwise
             Z    1 if a zero result is loaded, 0 otherwise
             V    0
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1


Example      LDF @9800h,R2

                                Before Instruction    After Instruction
             DP
             R2                                       010C52 A000h (2.19254303e+00)
             Data at 80 9800h   2.19254303e+00        10C5 2A00h (2.19254303e+00)

             [DP value, initial R2 value, and the status-flag boxes are not
             recoverable]


LDFcond      Load Floating-Point Value Conditionally

Syntax       LDFcond src, dst

Operands     src: general-addressing modes (G)
             dst: register (R0-R11)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             00   register (R0-R11)
             01   direct
             10   indirect
             11   immediate

Operation    If cond is true:
                src → dst.
             Else:
                dst is unchanged.

Description  If the condition is true, the src operand is loaded into the dst
             register. Otherwise, the dst register is unchanged. The dst and src
             operands are assumed to be floating-point numbers.

             The ’C4x provides 20 condition codes that can be used with this
             instruction (see Section 14.2 on page 14-12 for a list of condition
             mnemonics, encoding, and flags). Note that an LDFU (load floating-point
             unconditionally) instruction is useful for loading R0-R11 without
             affecting condition flags.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1


Example      LDFZ R3,R5

                  Before Instruction                After Instruction
             R3   2CFF2C D500h (1.77055560e+13)     2CFF2C D500h (1.77055560e+13)
             R5   5F0000 003Eh (3.96140824e+28)     2CFF2C D500h (1.77055560e+13)

             [Status-flag boxes (LUF, LV, UF, N, Z, V, C) not recoverable]


LDFI         Load Floating-Point Value, Interlocked

Syntax       LDFI src, dst

Operands     src: general-addressing modes (G)
             dst: register (R0-R11)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             01   direct
             10   indirect

Operation    Signal interlocked operation.
             src → dst

Description  The src operand is loaded into the dst register. An interlocked operation
             is signaled over LOCK or LLOCK. The src and dst operands are assumed to
             be floating-point numbers. Only direct and indirect modes are allowed.
             Refer to Section 9.7 (page 9-39) for a detailed description.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   0
             N    1 if a negative result is generated, 0 otherwise
             Z    1 if a zero result is generated, 0 otherwise
             V    0
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1


Example      LDFI *+AR2,R7

                                Before Instruction            After Instruction
             AR2                80 98F1h                      80 98F1h
             R7                                               -6.28125e+01
             Data at 80 98F2h   584 C000h (-6.28125e+01)      -6.28125e+01

             [Initial R7 value and the status-flag boxes are not recoverable]


LDF||LDF     Parallel LDF and LDF

Syntax       LDF src2, dst2
             || LDF src1, dst1

Operands     src1: indirect (disp = 0, 1, IR0, IR1)
             dst1: register (R0-R7)
             src2: indirect (disp = 0, 1, IR0, IR1)
             dst2: register (R0-R7)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  None

Operation    src2 → dst2
             || src1 → dst1

Description  Two floating-point loads are performed in parallel. If the LDFs load the
             same register, the assembler issues a warning. The result is that of
             LDF src2, dst2.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1


Example      LDF *--AR1(IR0),R7
             || LDF *AR7++(1),R3

                                Before Instruction          After Instruction
             AR1                                            80 9857h
             IR0
             R7                                             1.4050e+02
             AR7                80 988Ah                    80 988Bh
             R3                                             6.281250e+01
             Data at 80 9857h   70C 8000h (1.4050e+02)      70C 8000h (1.4050e+02)
             Data at 80 988Ah   57B 4000h (6.281250e+01)    57B 4000h (6.281250e+01)

             [Initial AR1, IR0, R7, and R3 values and the status-flag boxes are not
             recoverable]


LDF||STF     Parallel LDF and STF

Syntax       LDF src2, dst1
             || STF src3, dst2

Operands     src2: indirect (disp = 0, 1, IR0, IR1)
             dst1: register (R0-R7)
             src3: register (R0-R7)
             dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  None

Operation    src2 → dst1
             || src3 → dst2

Description  A floating-point load and a floating-point store are performed in
             parallel.

             If src2 and dst2 point to the same location, src2 is read before the
             write to dst2.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1


Example      LDF *AR2--(1),R1
             || STF R3,*AR4++(IR1)

                                Before Instruction        After Instruction
             AR2                                          80 98E6h
             R1                                           070C80 0000h (1.4050e+02)
             R3                 6.28125e+01               057B40 0000h (6.28125e+01)
             AR4                                          80 9910h
             IR1
             Data at 80 98E7h   70C 8000h (1.4050e+02)    1.4050e+02
             Data at 80 9900h   0h                        6.28125e+01

             [Initial AR2, R1, AR4, and IR1 values and the status-flag boxes are not
             recoverable]
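The examples in this chapter rely on the indirect-addressing notation (pre- or post-modification of an auxiliary register by a displacement). The Python sketch below summarizes how the effective address and the updated register are derived for the six basic forms; it reflects the usual reading of the ’C4x assembler syntax, omits circular and bit-reversed modes, and uses hypothetical mode strings as keys.

```python
def indirect(ar: int, disp: int, mode: str):
    """Return (effective_address, updated_ar) for one indirect operand."""
    table = {
        "*+AR(d)":  (ar + disp, ar),          # predisplacement add, AR unchanged
        "*-AR(d)":  (ar - disp, ar),          # predisplacement subtract, AR unchanged
        "*++AR(d)": (ar + disp, ar + disp),   # pre-add and modify
        "*--AR(d)": (ar - disp, ar - disp),   # pre-subtract and modify
        "*AR++(d)": (ar, ar + disp),          # post-add and modify
        "*AR--(d)": (ar, ar - disp),          # post-subtract and modify
    }
    address, new_ar = table[mode]
    return address & 0xFFFFFFFF, new_ar & 0xFFFFFFFF

# LDF *AR2--(1),R1: the initial AR2 (80 98E7h) is inferred from the data address.
print([hex(v) for v in indirect(0x8098E7, 1, "*AR--(d)")])
# STF R3,*AR4++(IR1): AR4 = 80 9900h and IR1 = 10h are likewise inferred.
print([hex(v) for v in indirect(0x809900, 0x10, "*AR++(d)")])
```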


LDHI         Load 16 MSBs With 16-Bit Immediate

Syntax       LDHI src, dst

Operands     src: 16-bit unsigned immediate
             dst: register mode

Opcode       [Opcode diagram partially legible: the 16 LSBs hold src (immediate
             value)]

Word Fields  None

Operation    src → 16 MSBs of dst

Description  The 16-bit unsigned src immediate value is loaded into the 16 MSBs of the
             dst register, and 0 is loaded into the 16 LSBs of the dst register. The
             dst register is assumed to be an integer.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1

Example      LDHI 44h,R2

                  Before Instruction    After Instruction
             R2   ABCD EF12h            0044 0000h
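In terms of bit operations, LDHI amounts to a 16-bit left shift of the immediate, with the previous register contents discarded. A minimal Python sketch (the function name `ldhi` is an assumption) reproduces the example:

```python
def ldhi(imm16: int) -> int:
    """Model LDHI: immediate goes to bits 31-16, bits 15-0 are cleared."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("src must be a 16-bit unsigned immediate")
    return imm16 << 16

print(f"{ldhi(0x44):08X}")                # LDHI 44h,R2
```

The old value of the destination (ABCD EF12h in the example) does not influence the result.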


LDI          Load Integer

Syntax       LDI src, dst

Operands     src: general-addressing modes (G)
             dst: register (any register in CPU primary-register file)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             00   register (any register in CPU primary-register file)
             01   direct
             10   indirect
             11   immediate

Operation    src → dst

Description  The src operand is loaded into the dst register. The dst and src operands
             are assumed to be signed integers.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   0
             N    1 if a negative result is generated, 0 otherwise
             Z    1 if a zero result is generated, 0 otherwise
             V    0
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1


Example      LDI *-AR1(IR0),R5

                           Before Instruction    After Instruction
             AR1
             IR0
             R5            3C5h (965)            38
             Data at 27h   38                    26h (38)

             [AR1 and IR0 values and the status-flag boxes are not recoverable]


LDIcond      Load Integer Conditionally

Syntax       LDIcond src, dst

Operands     src: general-addressing modes (G)
             dst: register (any register in CPU primary-register file)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             00   register (any register in CPU primary-register file)
             01   direct
             10   indirect
             11   immediate

Operation    If cond is true:
                src → dst.
             Else:
                dst is unchanged.

Description  If the condition is true, the src operand is loaded into the dst
             register. Otherwise, the dst register is unchanged. The dst and src
             operands are assumed to be signed integers.

             LDP (an alternate form of LDIU) loads the data-page pointer register
             (DP) or any other register with the 16 MSBs of a relocatable address.

             The ’C4x provides 20 condition codes that can be used with this
             instruction (see Section 14.2 for a list of condition mnemonics,
             encoding, and flags). Note that a load integer unconditionally (LDIU)
             instruction is useful for loading a selected CPU register without
             affecting the condition flags that the LDI instruction affects.

Status Bits  LUF  Unaffected
             LV   Unaffected
             UF   Unaffected
             N    Unaffected
             Z    Unaffected
             V    Unaffected
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1
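The conditional load can be sketched in Python as a select between the source and the old destination. Only a small subset of the 20 condition codes is modeled, and the `CONDITIONS` table, the flag dictionary, and the function name are assumptions for illustration; Section 14.2 remains the reference for the full list.

```python
# Hypothetical subset of condition mnemonics, expressed as tests on the flags.
CONDITIONS = {
    "U":  lambda f: True,                 # unconditional (LDIU)
    "Z":  lambda f: f["Z"] == 1,          # zero (LDIZ)
    "NZ": lambda f: f["Z"] == 0,          # not zero (LDINZ)
    "N":  lambda f: f["N"] == 1,          # negative (LDIN)
    "NN": lambda f: f["N"] == 0,          # not negative (LDINN)
}

def ldi_cond(cond: str, flags: dict, src: int, dst: int) -> int:
    """Return the new dst value; the flags themselves are never modified."""
    return src if CONDITIONS[cond](flags) else dst

# LDIZ R4,R6 with Z = 0: the load is skipped and R6 keeps its value.
print(ldi_cond("Z", {"Z": 0, "N": 0}, 636, 4066))
```

Because the instruction leaves every status bit unaffected, a sequence of conditional loads can all test the flags produced by one earlier compare.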


Example      LDIZ R4,R6

                  Before Instruction    After Instruction
             R4   636                   636
             R6   4066                  4066

             [Status-flag boxes (LUF, LV, UF, N, Z, V, C) not recoverable]


LDII         Load Integer, Interlocked

Syntax       LDII src, dst

Operands     src: general-addressing modes (G)
             dst: register (any register in CPU primary-register file)

Opcode       [Opcode diagram (bits 31-0) not recoverable]

Word Fields  G    src addressing modes
             01   direct
             10   indirect

Operation    Signal interlocked operation.
             src → dst

Description  The src operand is loaded into the dst register. An interlocked operation
             is signaled over LOCK or LLOCK. The src and dst operands are assumed to
             be signed integers. Note that only the direct and indirect modes are
             allowed. Refer to Section 9.7 on page 9-39 for a detailed description.

Status Bits  If ST(SET COND) = 0 and the destination register is R0-R11, the condition
             flags are modified. If ST(SET COND) = 1, they are modified for all
             destination registers.

             LUF  Unaffected
             LV   Unaffected
             UF   0
             N    1 if a negative result is generated, 0 otherwise
             Z    1 if a zero result is generated, 0 otherwise
             V    0
             C    Unaffected

Mode Bit     OVM  Operation is not affected by OVM bit value.

Cycles       1


Example      LDII @985Fh,R3

                                Before Instruction    After Instruction
             DP                 80h
             R3                                       0DCh
             Data at 80 985Fh   0DCh                  0DCh

             [DP value after the instruction, initial R3 value, and the status-flag
             boxes are not recoverable]


LDI||LDI Parallel LDI and LDI


Syntax          LDI src2, dst2
             || LDI src1, dst1


Operands        src1: indirect (disp = 0, 1, IR0, IR1)
                dst1: register (R0-R7)
                src2: indirect (disp = 0, 1, IR0, IR1)
                dst2: register (R0-R7)


Opcode          31      24 23      16 15       8 7        0


Word Fields     None


Operation       src2 → dst2
             || src1 → dst1


Description     Two integer loads are performed in parallel. A warning is
                issued by the assembler if the LDIs load the same register.
                In that case, the result is that of LDI src2, dst2.


Status Bits     LUF  Unaffected
                LV   Unaffected
                UF   Unaffected
                N    Unaffected
                Z    Unaffected
                V    Unaffected
                C    Unaffected


Mode Bit        OVM operation is not affected by OVM bit value.


Cycles          1


Example         LDI *-AR1(1),R7
             || LDI *AR7++(IR0),R1

                                    Before Instruction    After Instruction
                AR1                 80 9826h              80 9826h
                R7                                        0FAh (250)
                AR7                 80 98C8h              80 98D8h
                IR0                 10h                   10h
                R1                  0h                    2EEh (750)
                Data at 80 9825h    0FAh (250)            0FAh (250)
                Data at 80 98C8h    2EEh (750)            2EEh (750)


LDI||STI Parallel LDI and STI


Syntax          LDI src2, dst1
             || STI src3, dst2


Operands        src2: indirect (disp = 0, 1, IR0, IR1)
                dst1: register (R0-R7)
                src3: register (R0-R7)
                dst2: indirect (disp = 0, 1, IR0, IR1)


Opcode          31      24 23      16 15       8 7        0


Word Fields     None


Operation       src2 → dst1
             || src3 → dst2


Description     An integer load and an integer store are performed in
                parallel. If src2 and dst2 point to the same location, src2
                is read before the write to dst2.


Status Bits     LUF  Unaffected
                LV   Unaffected
                UF   Unaffected
                N    Unaffected
                Z    Unaffected
                V    Unaffected
                C    Unaffected


Mode Bit        OVM operation is not affected by OVM bit value.


Cycles          1
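The read-before-write rule for src2 and dst2 can be illustrated with a small behavioural sketch in Python. This is not TI code: the function name ldi_sti and the dictionary-based register file and memory are assumptions made purely for illustration, and only the ordering of the read and the write is modelled.

```python
def ldi_sti(regs, mem, src2_addr, dst1, src3, dst2_addr):
    """Model LDI src2,dst1 || STI src3,dst2 (hypothetical helper, not a TI API)."""
    # Read phase: fetch the memory word for the load and the register for the store.
    loaded = mem[src2_addr]
    stored = regs[src3]
    # Write phase: both results are committed only after all reads are done,
    # so a store to the same address as src2 cannot change the loaded value.
    regs[dst1] = loaded
    mem[dst2_addr] = stored


regs = {"R2": 0x0, "R7": 0x35}
mem = {0x8098E6: 0xDC}

# src2 and dst2 deliberately point to the same location.
ldi_sti(regs, mem, 0x8098E6, "R2", "R7", 0x8098E6)
print(hex(regs["R2"]), hex(mem[0x8098E6]))
```

In this sketch R2 receives the old memory contents (0DCh) even though the same word is overwritten with R7 (35h) by the parallel store.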


Example         LDI *-AR1(1),R2
             || STI R7,*AR5++(IR0)

                                    Before Instruction    After Instruction
                AR1                 80 98E7h              80 98E7h
                R2                                        0DCh (220)
                R7                  35h (53)              35h (53)
                AR5                 80 982Ch              80 9834h
                IR0                 8h                    8h
                Data at 80 98E6h    0DCh (220)            0DCh (220)
                Data at 80 982Ch    0h                    35h (53)


LDM Load Floating-Point Mantissa


Syntax          LDM src, dst


Operands        src: general-addressing modes (G)
                dst: register (R0-R11)


Opcode          31      24 23      16 15       8 7        0


Word Fields     G   src addressing modes
                00  register (R0-R11)
                01  direct
                10  indirect
                11  immediate


Operation       src (man) → dst (man)


Description     The mantissa field of the src operand is loaded into the
                mantissa field of the dst register. The dst exponent field is
                not modified. The src and dst operands are assumed to be
                floating-point numbers. If immediate addressing mode is used,
                bits 15-12 of the instruction word are forced to 0 by the
                assembler. If the source is in memory, the 32-bit data are
                loaded into the mantissa field.


Status Bits     LUF  Unaffected
                LV   Unaffected
                UF   Unaffected
                N    Unaffected
                Z    Unaffected
                V    Unaffected
                C    Unaffected


Mode Bit        OVM operation is not affected by OVM bit value.


Cycles          1
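As a rough illustration of the mantissa-only load, the following Python sketch treats an extended-precision register as a 40-bit integer. The layout assumed here (exponent in bits 39-32, sign and fraction in bits 31-0) and the helper name ldm are assumptions for the sketch, not taken from this section. The starting value of R2 is likewise assumed to have an exponent field of 00h.

```python
EXP_MASK = 0xFF_0000_0000  # assumed exponent field, bits 39-32
MAN_MASK = 0x00_FFFF_FFFF  # assumed mantissa field (sign + fraction), bits 31-0


def ldm(src, dst):
    """Return dst with its mantissa replaced by the mantissa of src."""
    # The exponent of dst is kept; only the 32 mantissa bits come from src.
    return (dst & EXP_MASK) | (src & MAN_MASK)


# 156.75 is encoded as 07 1CC0 0000h; R2 is assumed to start with exponent 00h.
r2 = ldm(0x07_1CC0_0000, 0x00_0000_0000)
print(f"{r2:010X}")
```

Under these assumptions the result is 00 1CC0 0000h, the value shown for R2 after the LDM 156.75,R2 example.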


Load Floating-Point Mantissa LDM 


Example LDM 156.75,R2 (156.75 = 07 1CC0 0000h)

        Before Instruction    After Instruction
R2                            00 1CC0 0000h    1.22460938e+00
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LDP Load Data Page Pointer 


Syntax 
Operands 


Opcode 


Word Fields 
Operation 


Description 


Status Bits 


Mode Bit 
Cycles 


Example 


14-138 


LDP src[,DP]


src: 16 MSBs of the absolute 32-bit source address (src). 
dst: optional (data-page pointer understood if “,DP” left out of operand) 


31 24 23 1615 87 0 
0101/00000/11]110000 src 


None 
src → data-page pointer


This pseudo-op is an alternate form of the LDIU instruction, except that LDP
is always in the immediate addressing mode (bits 22-21 = 11 binary). The 16
MSBs of the absolute 32-bit src value (note that a src of less than 32 bits is
zero-filled to make the 32 bits) are loaded into the 16 LSBs of the data-page
pointer.


The 16 LSBs of the pointer are used in direct addressing as a pointer to the
page of data being addressed. There is a total of 64K pages, each page 64K
words long. Bits 31-16 of the pointer are reserved and should be kept at zero.
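The page arithmetic described above can be sketched in a few lines of Python. The helper names ldp and direct_address are hypothetical, and forming the full address by concatenating the page number with a 16-bit offset is an assumption drawn from the 64K-pages-of-64K-words description.

```python
def ldp(src):
    """Return the data-page value: the 16 MSBs of the zero-filled 32-bit src."""
    src32 = src & 0xFFFFFFFF       # zero-fill or truncate to 32 bits
    return (src32 >> 16) & 0xFFFF  # 16 MSBs land in the 16 LSBs of DP


def direct_address(dp, offset16):
    """Assumed direct-address formation: page number followed by a 16-bit offset."""
    return ((dp & 0xFFFF) << 16) | (offset16 & 0xFFFF)


dp = ldp(0x809900)
addr = direct_address(dp, 0x9900)
print(f"DP = {dp:04X}h, address = {addr:08X}h")
```

For the source address 809900h this gives a page value of 0080h, which agrees with the LDP @809900h example.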


LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
Vv Unaffected 
Cc Unaffected 


OVM operation is not affected by OVM bit value. 
1


LDP @809900h,DP

or

LDP @809900h

        Before Instruction    After Instruction
DP                            0080h    (16 MSBs of 32-bit src, zeros extended)


Syntax 
Operands 


Opcode 


Word Fields 
Operation 


Description 


Status Bits 


Mode Bit 
Cycles 


Example 


Load Integer From Primary Register File to Expansion Register File LDPE 


LDPE src, dst 


src: register mode (any register in CPU primary-register file)
dst: expansion-register file register (IVTP or TVTP)


31 24 23 1615 87 0 
o11101101/00] at |oo000000000| se | 
None 

src → dst


This is a means to load the interrupt-vector table pointer (IVTP) register or
trap-vector table pointer (TVTP) register. These registers are described in
Section 3.2 on page 3-17.


The src operand register from the primary-register file is loaded into the dst 
register in the expansion register file. The dst operand is assumed to be an 
integer. 


LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Zz Unaffected 
V Unaffected 
Cc Unaffected 


OVM operation is not affected by OVM bit value. 
1


LDPE ARO, TVTP ; set trap-vector pointer 
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LDPK Load Data-Page Pointer Immediate 


Syntax 
Operands 
Opcode 


Word Fields 
Operation 


Description 


Status Bits 


Mode Bit 


Cycles 
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LDPK src 


src: 16-bit unsigned immediate 


31 24 23 1615 87 0 
foool11111011/10000 src 

None 

src > DP 


The 16-bit unsigned immediate value is loaded into the DP register. This
operation is completed by the end of the decode phase of the LDPK instruction;
thus, the value loaded is ready for the next instruction for immediate
addressing. Use caution when using the DP register in the instruction that
precedes the LDPK. For example:


PUSH DP
LDPK new_value


pushes the new DP value onto the stack instead of saving the old DP value.


LUF Unaffected 
LV Unaffected 
UF Unaffected 
N Unaffected 
Z Unaffected 
V Unaffected 
Cc Unaffected 


OVM operation is not affected by OVM bit value. 
1


Syntax 
Operands 


Opcode 


Word Fields 


Operation 


Description 


Load Half-Word LHw 


LHw src, dst 


src: register, direct, 16-bit immediate, or indirect-addressing modes 
dst: register mode (any register in CPU primary-register file) 


3 24 23 1615 87 0 
src 


, 
101141401 0/H|G|_ ast 


G src addressing modes 

00 register (any register in 
CPU primary-register file) 

01 direct 

10 indirect 

11 immediate 

H src half-word 

0 half-word 0 (LS half-word) 

1 half-word 1 (MS half-word) 


Sign-extended half-word (0, 1) of src → dst
w = half-word to load (0, 1)


The specified half-word of the src operand is sign-extended and right-shifted
into the 16 LSBs of the dst register. The src half-word is signed. When
immediate mode is specified and half-word 1 (H = 1) is selected, the LHw
instruction performs sign extension of the 16-bit value into a 32-bit value.
Consequently, the corresponding half-word value (0000h or FFFFh) is stored
into the 16 LSBs of the dst register.
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LHw Load Half-Word 


Status Bits If ST (SET COND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.


LUF Unaffected
LV Unaffected
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 0
C Unaffected
Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example LH0 R1,R2
Before Instruction After Instruction
R1 R1
R2 R2
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Syntax 
Operands 


Opcode 


Word Fields 


Operation 


Description 


Load Half-Word Unsigned LHUw 


LHUw src, dst 


src: register, direct, 16-bit immediate-, or indirect- addressing modes 
dst: register mode (any register in CPU primary-register file) 


3 24 23 1615 8 7 0 


| 


G src addressing modes 

00 register (any register in 
CPU primary-register file) 

01 direct 

10 indirect 

11 immediate 

H src half-word 

0 half-word 0 (LS half-word) 

1 half-word 1 (MS half-word) 


Unsigned half-word (0, 1) of src → dst
w = half-word to load (0, 1)


The specified half-word of the src operand is unsigned and right-shifted into
the 16 LSBs of the dst register. The src half-word is unsigned. When immediate
mode is specified and half-word 1 (H = 1) is selected, the LHw instruction
performs sign extension of the 16-bit value into a 32-bit value. Consequently,
the corresponding half-word value (0000h or FFFFh) is stored into the 16 LSBs
of the dst register.
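The difference between the signed (LHw) and unsigned (LHUw) half-word loads can be sketched as follows. This is a behavioural model only: the helper names are hypothetical, the operand values are made up for illustration, and immediate-mode handling and status flags are not modelled.

```python
def half_word(src, h):
    """Select half-word h (0 = LS half-word, 1 = MS half-word) of a 32-bit src."""
    return (src >> (16 * h)) & 0xFFFF


def lh(src, h):
    """LHw sketch: the selected half-word, sign-extended to 32 bits."""
    hw = half_word(src, h)
    return hw | 0xFFFF0000 if hw & 0x8000 else hw


def lhu(src, h):
    """LHUw sketch: the selected half-word, zero-extended to 32 bits."""
    return half_word(src, h)


value = 0x12348765  # illustrative operand, not taken from the manual
print(f"{lh(value, 0):08X} {lh(value, 1):08X} {lhu(value, 0):08X}")
```

With the illustrative operand above, the signed load of half-word 0 fills the upper 16 bits with ones because bit 15 of the half-word is set, while the unsigned load leaves them at zero.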
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LHUw sLoad Half-Word Unsigned 


Status Bits If ST (SET COND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.


LUF Unaffected
LV Unaffected
UF 0
N 0
Z 1 if a zero result is generated, 0 otherwise
V 0
C Unaffected
Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example LHU0 R1,R2
Before Instruction After Instruction
R1 R1
R2 R2
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Syntax 
Operands 


Opcode 


Word Fields 


Operation 


Description 


Logical Shift LSH 


LSH src_count, dst 


src_count: general-addressing modes (G) 
dst: register (any register in CPU primary-register file) 


31 24 23 1615 8 7 0 


10 0 0) 01001%1/ G dst src_count 


G src addressing modes 

00 register (any register in 
CPU primary-register file) 

01 direct 

10 indirect 

11 immediate 


count = 7 LSBs of src_count
If count ≥ 0:
    dst << count → dst
Else:
    dst >> |count| → dst


The seven LSBs of the src_count operand constitute the 2s-complement shift
count. If count is greater than 0, the dst operand is left-shifted by the value
of count. Low-order bits shifted in are zero-filled, and high-order bits are
shifted out through the C (carry) bit.


Logical left-shift:
C ← dst ← 0


If count is less than 0, the dst is right-shifted by the absolute value of the
count operand. The high-order bits of the dst operand are zero-filled as they
are shifted to the right. Low-order bits are shifted out through the C (carry)
bit.


Logical right-shift:
0 → dst → C

If count is 0, no shift is performed, and the C (carry) bit is cleared to 0.

If count is greater than 32, the C (carry) bit gets the LSB. If count is less
than -32, the C bit is cleared to 0.


The src_count operand is assumed to be a signed integer, and the dst operand
is assumed to be an unsigned integer.
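The shift-count decoding and carry behaviour described above can be modelled with a short Python sketch. The function names are hypothetical, the operand values in the usage lines are illustrative, and the sketch deliberately covers only counts from -32 to 32; the special cases stated for counts beyond that range are not modelled.

```python
MASK32 = 0xFFFFFFFF


def shift_count(src_count):
    """Interpret the 7 LSBs of src_count as a 2s-complement value (-64..63)."""
    c = src_count & 0x7F
    return c - 0x80 if c & 0x40 else c


def lsh(dst, src_count):
    """LSH sketch: return (result, flags) for a 32-bit logical shift."""
    count = shift_count(src_count)
    if abs(count) > 32:
        raise ValueError("counts beyond +/-32 are outside this sketch")
    dst &= MASK32
    if count == 0:
        result, carry = dst, 0                 # no shift, C cleared
    elif count > 0:
        carry = (dst >> (32 - count)) & 1      # last bit shifted out on the left
        result = (dst << count) & MASK32
    else:
        n = -count
        carry = (dst >> (n - 1)) & 1           # last bit shifted out on the right
        result = dst >> n
    flags = {"UF": 0, "N": result >> 31, "Z": int(result == 0), "V": 0, "C": carry}
    return result, flags


left, left_flags = lsh(0x000002AC, 24)
right, right_flags = lsh(0x0000C3B8, -12 & MASK32)
print(f"{left:08X} {left_flags['C']}  {right:08X} {right_flags['C']}")
```

The second call passes -12 as a 32-bit 2s-complement word to show that only the seven LSBs of the count operand take part in the decoding.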


Status Bits If ST (SET COND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.

LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero output is generated, 0 otherwise
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example 1 LSH R4,R7

Before Instruction After Instruction

R4 24 R4 24
R7 R7

Example 2 LSH *-AR5(IR0),R5

Before Instruction After Instruction

AR5 AR5
IR0 IR0
R5 R5
Data at 80 9904h Data at 80 9904h
-12 -12
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Logical Shift, 3 Operands LSH3 


Syntax LSH3 src_count, src, dst


Operands src, src_count: both type 1 or type 2 three-operand addressing modes
dst: register mode (any register in CPU primary-register file)


Opcode
Type 1
31 24 23 16 15 8 7 0
Type 2
31 24 23 16 15 8 7 0


Word Fields
Type 1
T   src addressing modes                     src_count addressing modes
00  register mode (any CPU register)         register mode (any CPU register)
01  indirect mode (disp = 0, 1, IR0, IR1)    register mode (any CPU register)
10  register mode (any CPU register)         indirect mode (disp = 0, 1, IR0, IR1)
11  indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

Type 2
T   src addressing modes                     src_count addressing modes
00  register mode (any CPU register)         8-bit signed immediate
01  register mode (any CPU register)         indirect mode *+ARn(5-bit unsigned
                                             displacement)
10  indirect mode *+ARn(5-bit unsigned       8-bit signed immediate
    displacement)
11  indirect mode *+ARn1(5-bit unsigned      indirect mode *+ARn2(5-bit unsigned
    displacement)                            displacement)


Operation count = 7 LSBs of src_count
If count ≥ 0:
    src << count → dst
Else:
    src >> |count| → dst
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LSH3 Logical Shift, 3 Operands 


Description 


Status Bits 


Mode Bit 
Cycles 


Example 
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The seven LSBs of the src_count operand constitute the 2s-complement shift 
count. 


If count is greater than 0, a copy of the src operand is left-shifted by the
value of count, and the result is written to the dst (the src is not changed).
Low-order bits shifted in are zero-filled, and high-order bits are shifted out
through the C (carry) bit.


Logical left-shift:
C ← src ← 0


If count is less than 0, the src operand is right-shifted by the absolute value
of count. The high-order bits of the dst operand are zero-filled as they are
shifted to the right. Low-order bits are shifted out through the C (carry) bit.


Logical right-shift:
0 → src → C

If count is 0, no shift is performed and the C (carry) bit is set to 0.


If count is greater than 32, the carry (C) bit is set to the LSB. If count is
less than -32, the carry bit is cleared to 0. This also applies to LSH.


The src_count operand is assumed to be a signed integer. The src and dst
operands are assumed to be unsigned integers.


If ST (SET COND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.


LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero output is generated, 0 otherwise
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0

OVM operation is not affected by OVM bit value.
1


None 


Syntax 


Operands 


Opcode 


Word Fields 


Operation 


Description 


Parallel LSH3 and ST|_ LSH3||STI 


LSH3 src_count, src2, dst1
|| STI src3, dst2


src_count: register (R0-R7)
src2: indirect (disp = 0, 1, IR0, IR1)
dst1: register (R0-R7)
src3: register (R0-R7)
dst2: indirect (disp = 0, 1, IR0, IR1)


31 24 23 16 15 8 7 0
dst1 | src_count | src3 | dst2 | src2


None


count = 7 LSBs of src_count
If count ≥ 0:
    src2 << count → dst1
Else:
    src2 >> |count| → dst1
|| src3 → dst2


The seven LSBs of the src_count operand constitute the 2s-complement shift
count.


If count is greater than 0, a copy of the src2 operand is left-shifted by the
value of count and the result is written to dst1 (src2 is not changed).
Low-order bits shifted in are zero-filled, and high-order bits are shifted out
through the C (carry) bit.


Logical left-shift:
C ← src2 ← 0


If count is less than 0, a copy of the src2 operand is right-shifted by the
absolute value of count. The high-order bits of the dst1 operand are
zero-filled as they are shifted to the right. Low-order bits are shifted out
through the C (carry) bit.


Logical right-shift:
0 → src2 → C

If count is 0, no shift is performed and the carry bit is set to 0.


The src_count operand is assumed to be a signed integer, and the src2 and
dst1 operands are assumed to be unsigned integers. All registers are read at
the beginning and loaded at the end of the execute cycle. This means that if
one of the parallel operations (STI) reads from a register and the operation
being performed in parallel (LSH3) writes to the same register, then STI
accepts as input the contents of the register before it is modified by the
LSH3.


If src2 and dst2 point to the same location, src2 is read before the write to
dst2.


Status Bits LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero output is generated, 0 otherwise
V 0
C Set to the value of the last bit shifted out. 0 for a shift count of 0
Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example 1 LSH3 R2,*++AR3(1),R0
|| STI R4,*-AR5

                    Before Instruction    After Instruction
R2                  24                    24
R4                  0DCh (220)            0DCh (220)
Data at 80 98C3h    0ACh                  0ACh
Data at 80 98A2h    0h                    0DCh (220)
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Parallel LSH3 and ST|_ LSH3||STI 


Example 2 LSH3 R7,*AR2--(1),R2
|| STI R0,*+AR0(1)

                    Before Instruction    After Instruction
R7                  -12                   -12
AR2                 80 9863h              80 9862h
R0                  12Ch (300)            12Ch (300)
AR0                 80 98B7h              80 98B7h
Data at 80 98B8h    0h                    12Ch (300)
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LWLct Load Word Left-Shifted 


Syntax LWLct src, dst


Operands ct: the count of bytes {0, 1, 2, or 3} to shift left (ct × 8 = shift in bits)
src: register-, direct-, 16-bit immediate-, or indirect-addressing modes
dst: register mode (any register in CPU primary-register file)


Opcode
31 24 23 16 15 8 7 0
Word Fields
G src addressing modes
00 register (any register in
CPU primary-register file)
01 direct
10 indirect
11 immediate
B src byte
00 no shift
01 shift left 1 byte
10 shift left 2 bytes
11 shift left 3 bytes
Operation src << {0, 1, 2, or 3} bytes and merged with dst → dst
Description The src operand is left-shifted the specified number of bytes and
merged with the bytes of the dst register that are below the left-shifted LSB
of the src operand. When immediate mode is selected, this instruction performs
a sign extension of the 16-bit immediate value into a 32-bit value; then, this
32-bit value is shifted and merged.
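The shift-and-merge operation of LWLct can be sketched in Python as follows. The helper name lwl and the operand values are illustrative assumptions; status flags and the immediate-mode sign extension are not modelled.

```python
MASK32 = 0xFFFFFFFF


def lwl(ct, src, dst):
    """LWLct sketch: shift src left by ct bytes and keep the dst bytes below it."""
    shift = 8 * ct
    low_mask = (1 << shift) - 1  # dst bytes that lie below the shifted src
    return ((src << shift) & MASK32) | (dst & low_mask)


# Illustrative operands, not taken from the manual.
print(f"{lwl(2, 0x0000ABCD, 0x12345678):08X}")
```

With ct = 2 the two low bytes of dst survive and the two low bytes of src move into the upper half of the result.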
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Load Word Left-Shifted LWLct 


Status Bits If ST (SET COND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.


LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero result is generated, 0 otherwise
V 0
C Unaffected
Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example LWL2 R1,R2
Before Instruction After Instruction
R1 R1
R2 R2
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LWRct Load Word Right-Shifted 


Syntax LWRct src, dst


Operands ct: the count of bytes {0, 1, 2, or 3} to shift right (ct × 8 = shift in bits)
src: register-, direct-, 16-bit immediate-, or indirect-addressing modes
dst: register mode (any register in CPU primary-register file)


Opcode
31 24 23 16 15 8 7 0
Word Fields
G src addressing modes
00 register (any register in
CPU primary-register file)
01 direct
10 indirect
11 immediate
B src byte
00 no shift
01 shift right 1 byte
10 shift right 2 bytes
11 shift right 3 bytes
Operation src >> {0, 1, 2, or 3} bytes and merged with dst → dst
Description The src operand is right-shifted the specified number of bytes and
merged with the bytes of the dst register that are above the right-shifted MSB
of the src operand. Sign is not extended. When immediate mode is selected, this
instruction performs a sign extension of the 16-bit immediate value into a
32-bit value; then, this 32-bit value is shifted and merged.
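A matching Python sketch for LWRct follows. As before, the helper name lwr and the operand values are illustrative assumptions, and status flags and immediate-mode sign extension are left out.

```python
MASK32 = 0xFFFFFFFF


def lwr(ct, src, dst):
    """LWRct sketch: shift src right by ct bytes and keep the dst bytes above it."""
    shift = 8 * ct
    high_mask = MASK32 ^ (MASK32 >> shift)  # dst bytes above the shifted src
    # The shift is logical: no sign bits are propagated into the vacated bytes.
    return ((src & MASK32) >> shift) | (dst & high_mask)


# Illustrative operands, not taken from the manual.
print(f"{lwr(1, 0xABCDEF12, 0x12345678):08X}")
```

With ct = 1 the top byte of dst is preserved and the upper three bytes of src fill the rest of the result.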
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Load Word Right-Shifted LWRct 


Status Bits If ST (SET COND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.


LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero result is generated, 0 otherwise
V 0
C Unaffected
Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example LWR1 AR1,R2
Before Instruction After Instruction
AR1 AR1
R2 R2
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MBct Merge Byte, Left-Shifted 


Syntax MBct src, dst


Operands ct: the count of bytes {0, 1, 2, 3} to shift left (ct × 8 = shift in bits)
src: register-, direct-, or indirect-addressing modes
dst: register mode (any register in CPU primary-register file)


Opcode
31 24 23 16 15 8 7 0
Word Fields
G src addressing modes
00 register (any register in
CPU primary-register file)
01 direct
10 indirect
11 immediate
B src byte
00 no shift
01 shift left 1 byte
10 shift left 2 bytes
11 shift left 3 bytes
Operation 8 LSBs of src << {0, 1, 2, or 3} bytes and merged with dst → dst
Description The eight LSBs of the src operand are left-shifted 0, 1, 2, or 3
bytes and merged with the bits of the dst register that are below the
left-shifted LSB of the src operand. When immediate mode is selected, this
instruction performs a sign extension of the 16-bit immediate value into a
32-bit value; then, this 32-bit value is shifted and merged.
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Merge Byte, Left-Shifted MBct 


Status Bits If ST (SET COND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.


LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero result is generated, 0 otherwise
V 0
C Unaffected
Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example MB2 AR1,AR2
Before Instruction After Instruction
AR1 (0012 0000h) AR1
AR2 AR2
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MHct Merge Half-Word, Left-Shifted 


Syntax 
Operands 


Opcode 


Word Fields 


Operation 


Description 


Status Bits 
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MHct src, dst 


ct: the count of half-word (16-bit) shifts 
src: _register-, direct-, 16-bit immediate-, or indirect-addressing modes 
dst: _ register mode (any register in CPU primary-register file) 


31 24 23 1615 87 0 


G src addressing modes 

00 register (any register in 
CPU primary-register file) 

01 direct 

10 indirect 

11 immediate 

H src half-word 

0 half-word 0 (LS half-word) 

1 half-word 1 (MS half-word) 


16 LSBs of src << {0, 1} half-words and merged with dst → dst


The 16 LSBs of the src operand are left-shifted 0 or 1 half-words and merged
with the bits of the dst register that are below the left-shifted LSB of the
src operand. When immediate mode is selected, this instruction performs a sign
extension of the 16-bit immediate value into a 32-bit value; then, this 32-bit
value is shifted and merged.
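The half-word merge can be sketched in Python as a field replacement. The helper name mh is hypothetical. The sketch assumes that the half-word of dst that is not written keeps its old value; this matches the MH1 example for this instruction (AR1 = ABCD EF12h, AR2 = 1234 5678h) but is an assumption for the MH0 case.

```python
MASK32 = 0xFFFFFFFF


def mh(ct, src, dst):
    """MHct sketch: place the 16 LSBs of src into half-word ct of dst."""
    shift = 16 * ct
    field = 0xFFFF << shift
    # Clear the target half-word of dst, then drop in the shifted src half-word.
    return ((dst & ~field) | ((src & 0xFFFF) << shift)) & MASK32


ar2 = mh(1, 0xABCDEF12, 0x12345678)
print(f"{ar2:08X}")
```

The usage line reproduces the MH1 AR1,AR2 case: the low half-word EF12h of AR1 is shifted to EF12 0000h and merged over the low half-word 5678h of AR2.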


If ST (SET COND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.


LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero result is generated, 0 otherwise
V 0
C Unaffected


Mode Bit OVM operation is not affected by OVM bit value.


Cycles 1


Example MH1 AR1,AR2

        Before Instruction                       After Instruction
AR1     ABCD EF12h (shifted: EF12 0000h)         ABCD EF12h
AR2     1234 5678h                               EF12 5678h
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14-159 


MPYF Multiply Floating-Point Values 


Syntax 
Operands 


Opcode 


Word Fields 


Operation 


Description 


Status Bits 


Mode Bit 


Cycles 
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MPYF src, dst 


src: general-addressing modes (G) 
dst: register (RO — R11) 


31 24 23 1615 87 0 


G src addressing modes 
00 register (RO-R11) 

01 direct 

10 indirect 

11 immediate 


dst x src > dst 


The product of the dst and src operands is loaded into the dst register. The
values at src (if in register mode (R0-R11)) and dst are treated as
extended-precision floating-point numbers. For nonregister mode, src is treated
as a single-precision floating-point number.
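To relate the hexadecimal register images in the MPYF example to their decimal values, the sketch below decodes a 40-bit extended-precision word in Python. The format assumed here is an 8-bit 2s-complement exponent in bits 39-32, a sign bit in bit 31, and a 31-bit fraction with an implied leading bit; this layout is an assumption for the sketch rather than something stated in this section, and the name decode_ext is hypothetical.

```python
def decode_ext(word40):
    """Decode an assumed 40-bit extended-precision word into a Python float."""
    exp = (word40 >> 32) & 0xFF
    if exp >= 0x80:
        exp -= 0x100                  # exponent is 2s-complement
    man = word40 & 0xFFFFFFFF
    if exp == -128:
        return 0.0                    # assumed encoding of zero
    sign = man >> 31
    frac = (man & 0x7FFFFFFF) / 2**31
    lead = -2.0 if sign else 1.0      # implied leading bit: 01.f or 10.f
    return (lead + frac) * 2.0**exp


r0 = decode_ext(0x07_0C80_0000)       # R0 in the MPYF example
product = decode_ext(0x0A_600F_2000)  # R2 after the MPYF example
print(r0, product, r0 * 12.7578125)
```

Under these assumptions R0 decodes to 140.5 and the result word decodes to the product of 140.5 and 12.7578125, consistent with the decimal values printed beside the registers in the example.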


LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

OVM operation is not affected by OVM bit value.
1


Multiply Floating-Point Value MPYF 


Example MPYF R0,R2

        Before Instruction                   After Instruction
R0      07 0C80 0000h   1.4050e+02           07 0C80 0000h   1.4050e+02
R2      03 4C20 0000h   1.27578125e+01       0A 600F 2000h   1.79247266e+03
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MPYF3 = Multiply Floating-Point Values, 3 Operands 


Syntax 
Operands 


Opcode 


Word Fields 
Type 1 


Type 2 


Operation 
Description 


Status Bits 


Mode Bit 
Cycles 


Example 
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MPYFS8 src2, src1, dst 
src1, src2: type 1 or type 2 three-operand addressing modes
dst: register mode (R0-R11)

Type 1
31 24 23 16 15 8 7 0
opcode | T | dst | src1 | src2

Type 2
31 24 23 16 15 8 7 0


T   src1 addressing modes                    src2 addressing modes
00  register mode (R0-R11)                   register mode (R0-R11)
01  indirect mode (disp = 0, 1, IR0, IR1)    register mode (any CPU register)
10  register mode (R0-R11)                   indirect mode (disp = 0, 1, IR0, IR1)
11  indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)


T   src1 addressing modes                    src2 addressing modes
01  register mode (R0-R11)                   indirect mode *+ARn(5-bit unsigned
                                             displacement)
11  indirect mode *+ARn1(5-bit unsigned      indirect mode *+ARn2(5-bit unsigned
    displacement)                            displacement)


src1 × src2 → dst


The product of src1 and src2 is loaded into the dst register. The values at
src1, src2 (if src1 and src2 are in register mode (R0-R11)), and dst are
treated as extended-precision floating-point numbers. If src1 and src2 are in
nonregister mode, they are assumed to be single-precision floating-point
numbers.


LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

OVM operation is not affected by OVM bit value.
1


None 


Syntax 


Operands 


Opcode 


Word Fields 


Operation 


Description 


Parallel MPYF3 and ADDF3. MPYF3||ADDF3 


MPYF3 srcA, srcB, dst1
|| ADDF3 srcC, srcD, dst2


srcA, srcB, srcC, srcD: Any two must be indirect (disp = 0, 1, IR0, IR1), and
any two must be register (R0-R7).

dst1 register (d1):
    0 = R0
    1 = R1
dst2 register (d2):
    0 = R2
    1 = R3
src1 register (R0-R7)
src2 register (R0-R7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)
P parallel-addressing modes (0 ≤ P ≤ 3)


P    Operation (P Field)
00   src3 × src4, src1 + src2
01   src3 × src1, src4 + src2
10   src1 × src2, src3 + src4
11   src3 × src1, src2 + src4

31 24 23 16 15 8 7 0
None


srcA × srcB → dst1
|| srcC + srcD → dst2


A floating-point multiplication and a floating-point addition are performed in 
parallel. All registers are read at the beginning and loaded at the end of the 
execute cycle. This means that if one of the parallel operations (MPYF3) reads 
from a register and the operation being performed in parallel (ADDF3) writes 
to the same register, then MPYF3 accepts as input the contents of the register 
before it is modified by the ADDF3. 
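The read-at-the-beginning, load-at-the-end rule can be illustrated with a small Python sketch. For simplicity every operand is taken from a dictionary of registers holding Python floats; the real instruction requires two of the four sources to be indirect, so this is only a model of the ordering, and the name mpyf3_addf3 is hypothetical.

```python
def mpyf3_addf3(regs, src_a, src_b, dst1, src_c, src_d, dst2):
    """Model MPYF3 srcA,srcB,dst1 || ADDF3 srcC,srcD,dst2 on a register dict."""
    # Read phase: every source is sampled before any destination is written.
    a, b, c, d = regs[src_a], regs[src_b], regs[src_c], regs[src_d]
    # Write phase: both results are committed at the end of the cycle.
    regs[dst1] = a * b
    regs[dst2] = c + d


# MPYF3 reads R2 while the parallel ADDF3 writes R2 (illustrative values).
regs = {"R0": 0.0, "R2": 3.0, "R5": 179.75, "R7": 140.5}
mpyf3_addf3(regs, "R2", "R5", "R0", "R5", "R7", "R2")
print(regs["R0"], regs["R2"])
```

In this run the multiply uses the old value of R2 (3.0), giving 539.25 in R0, while R2 itself ends up holding the sum 320.25.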
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MPYF3||ADDF3 = Parallel MPYF3 and ADDF3 


Status Bits 


Mode Bit 
Cycles 


Example 




You can code any combination of addressing modes for the four possible
source operands as long as you code two as indirect and two as register. The
assignment of the source operands srcA-srcD to the src1-src4 fields varies,
depending on the combination of addressing modes used; the P field is
encoded accordingly. The assembler may, when not significant, change the
order of operands in commutative operations to simplify processing.


If src2 and dst2 point to the same location, src2 is read before the write to
dst2.

LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 0
Z 0
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected


OVM operation is not affected by OVM bit value.


1

MPYF3 *AR5++(1),*--AR1(IR0),R0
|| ADDF3 R5,R7,R3

                    Before Instruction              After Instruction
AR5                 80 98C5h                        80 98C6h
AR1                 80 98A8h                        80 98A4h
IR0                 4h                              4h
R0                                                  04 6718 0000h   2.88867188e+01
R5                  07 33C0 0000h   1.79750e+02     07 33C0 0000h   1.79750e+02
R7                  07 0C80 0000h   1.4050e+02      07 0C80 0000h   1.4050e+02
R3                                                  3.20250e+02
Data at 80 98C5h    1.2750e+01                      1.2750e+01
Data at 80 98A4h    2.265625e+00                    2.265625e+00


Syntax 


Operands 


Opcode 


Word Fields 


Operation 


Description 


Status Bits 


Mode Bit 


Cycles 


Parallel MPYF3 and STF_ MPYF3\||STF 


MPYF3 src2, src1, dst1
|| STF src3, dst2


src1: register (R0-R7)
src2: indirect (disp = 0, 1, IR0, IR1)
dst1: register (R0-R7)
src3: register (R0-R7)
dst2: indirect (disp = 0, 1, IR0, IR1)


31 24 23 16 15 8 7 0
dst1 | src1 | src3 | dst2 | src2


None


src1 × src2 → dst1
|| src3 → dst2


A floating-point multiplication and a floating-point store are performed in
parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (MPYF3) writes
to a register and the operation being performed in parallel (STF) reads from
the same register, then the STF accepts as input the contents of the register
before it is modified by the MPYF3.


If src2 and dst2 point to the same location, then src2 is read before the write
to dst2.


LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

OVM operation is not affected by OVM bit value.
1


Example MPYF3 *-AR2(1),R7,R0
|| STF R3,*AR0--(IR0)

                    Before Instruction          After Instruction
AR2
R7                  6.281250e+01                6.281250e+01
R0                                              8.82515625e+03
R3                  4.7031250e+02               4.7031250e+02
AR0
IR0
Data at 80 982Ah    70C 8000h  1.4050e+02       70C 8000h  1.4050e+02
Data at 80 9860h                                86B 2800h  4.7031250e+02
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]


MPYF3||SUBF3  Parallel MPYF3 and SUBF3

Syntax MPYF3 srcA, srcB, dst1
|| SUBF3 srcC, srcD, dst2

Operands srcA, srcB, srcC, srcD: any two must be indirect (disp = 0, 1, IR0, IR1),
and any two must be register (R0 - R7).

Opcode [Bit-field diagram, bits 31-0. Fields from MSB to LSB: opcode, P, d1, d2, src1, src2, src3 (bits 15-8), src4 (bits 7-0)]

Word Fields dst1 register (d1): 0 = R0, 1 = R1
dst2 register (d2): 0 = R2, 1 = R3
src1 register (R0 - R7)
src2 register (R0 - R7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)
P parallel-addressing modes (0 ≤ P ≤ 3)

P   Operation (P Field)
00  src3 x src4, src1 - src2
01  src3 x src1, src4 - src2
10  src1 x src2, src3 - src4
11  src3 x src1, src2 - src4

Operation srcA x srcB → dst1
|| srcD - srcC → dst2

Description A floating-point multiplication and a floating-point subtraction are performed
in parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (MPYF3) reads
from a register, and the operation being performed in parallel (SUBF3) writes
to the same register, then MPYF3 accepts as input the contents of the register
before it is modified by the SUBF3.
You can code any combination of addressing modes for the four possible
source operands as long as you code two as indirect and two as register. The
assignment of the source operands srcA - srcD to the src1 - src4 fields
varies, depending on the combination of addressing modes used; the P field is
encoded accordingly. The assembler may, when not significant, change the
order of operands in commutative operations to simplify processing.

Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 0
Z 0
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example MPYF3 R5,*++AR7(IR1),R0
|| SUBF3 R7,*AR3--(1),R2
or
MPYF3 *++AR7(IR1),R5,R0
|| SUBF3 R7,*AR3--(1),R2

                    Before Instruction     After Instruction
R5                  1.2750e+01             1.2750e+01
AR7
IR1
R0                                         2.88867188e+01
R7                  1.79750e+02            1.79750e+02
AR3
R2                                         -3.9250e+01
Data at 80 990Ch    2.250e+00              2.250e+00
Data at 80 98B2h    1.4050e+02             1.4050e+02
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]
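
The relationship between operand kinds and the P field can be sketched in Python. This is only an illustration of the Operation (P Field) table for MPYF3||SUBF3: the function name `select_p` and the strings "reg" and "ind" are hypothetical, and the real assembler may additionally reorder the commutative multiply operands.

```python
def select_p(src_a, src_b, src_c, src_d):
    """Return the 2-bit P field for MPYF3 srcA,srcB,dst1 || SUBF3 srcC,srcD,dst2.

    Each argument is "reg" (register R0-R7) or "ind" (indirect).
    The subtraction performed is srcD - srcC, so srcD is the minuend.
    """
    kinds = [src_a, src_b, src_c, src_d]
    if sorted(kinds) != ["ind", "ind", "reg", "reg"]:
        raise ValueError("exactly two operands must be indirect and two register")
    if src_a == "ind" and src_b == "ind":
        return 0b00  # src3 x src4, src1 - src2
    if src_a == "reg" and src_b == "reg":
        return 0b10  # src1 x src2, src3 - src4
    # One register and one indirect operand feed the multiplier;
    # P then depends on whether the minuend srcD is indirect or register.
    return 0b01 if src_d == "ind" else 0b11  # src4 - src2, or src2 - src4
```

For the example above (srcA = R5, srcB = *++AR7(IR1), srcC = R7, srcD = *AR3--(1)), this sketch selects P = 01.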


MPYI  Multiply Integer

Syntax MPYI src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31      24 23      16 15      8 7      0

Word Fields G   src addressing modes
00  register (any register in CPU primary-register file)
01  direct
10  indirect
11  immediate

Operation dst x src → dst

Description The product of the dst and src operands is loaded into the dst register. The src
and dst operands, when read, are assumed to be 32-bit signed integers. The
result is assumed to be a 64-bit signed integer. The output to the dst register
is the 32 LSBs of the result.

Integer overflow occurs when any of the 32 MSBs of the 64-bit result differs
from the MSB of the 32-bit output value.

Status Bits If ST (SET COND) = 0 and the destination register is R0 - R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.

LUF Unchanged
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is affected by OVM bit value.
Cycles 1
Example MPYI R1,R5

        Before Instruction              After Instruction
R1      00 0033 C251h  3 392 081        00 0033 C251h  3 392 081
R5      00 0078 B600h  7 910 912        00 E21D 9600h  -501 377 536
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]

The result overflows and R5 contains the 32 LSBs of the result. To obtain the
32 MSBs, use the MPYSHI3 or the MPYUHI3 instructions.
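
The overflow rule can be checked numerically. The following Python sketch models only the arithmetic described above (32 LSBs of the signed 64-bit product, plus the overflow test); the helper names are hypothetical, and the effect of the OVM bit is deliberately not modeled.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mpyi(dst, src):
    """Return (32 LSBs of dst * src as a signed value, overflow flag)."""
    product = to_signed32(dst) * to_signed32(src)  # exact signed product
    low = to_signed32(product)                     # what lands in dst
    # Overflow: the upper 32 bits are not a sign extension of the low word,
    # which is the same as saying the low word alone misrepresents the product.
    overflow = product != low
    return low, overflow

# Values from the MPYI R1,R5 example
result, overflow = mpyi(0x0033C251, 0x0078B600)
```

With the example operands, `result` is -501377536 (0E21D 9600h) and `overflow` is true, which matches the After Instruction column.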


MPYI3  Multiply Integer, 3 Operands

Syntax MPYI3 src2, src1, dst

Operands src1, src2: type 1 or type 2 three-operand addressing modes
dst: register mode (any register in CPU primary-register file)

Opcode Type 1   31      24 23      16 15      8 7      0
Type 2   31      24 23      16 15      8 7      0

Word Fields Type 1
T   src1 addressing modes                   src2 addressing modes
00  register mode (any CPU register)        register mode (any CPU register)
01  indirect mode (disp = 0, 1, IR0, IR1)   register mode (any CPU register)
10  register mode (any CPU register)        indirect mode (disp = 0, 1, IR0, IR1)
11  indirect mode (disp = 0, 1, IR0, IR1)   indirect mode (disp = 0, 1, IR0, IR1)

Type 2
T   src1 addressing modes                                   src2 addressing modes
00  register mode (any CPU register)                        8-bit signed immediate
01  register mode (any CPU register)                        indirect mode *+ARn(5-bit unsigned displacement)
10  indirect mode *+ARn(5-bit unsigned displacement)        8-bit signed immediate
11  indirect mode *+ARn1(5-bit unsigned displacement)       indirect mode *+ARn2(5-bit unsigned displacement)

Operation src1 x src2 → dst

Description The product of the numbers at src1 and src2 is loaded into the dst register. The
multiplied numbers are assumed to be 32-bit signed integers. The result is
assumed to be a signed 64-bit integer. The output to the dst register is the 32
least-significant bits of the result.

Integer overflow occurs when any of the 32 MSBs of the 64-bit result differs
from the MSB of the 32-bit dst value.


Status Bits If ST (SET COND) = 0 and the destination register is R0 - R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.

LUF Unchanged
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is affected by OVM bit value.
Cycles 1
Example None


MPYI3||ADDI3  Parallel MPYI3 and ADDI3

Syntax MPYI3 srcA, srcB, dst1
|| ADDI3 srcC, srcD, dst2

Operands srcA, srcB, srcC, srcD: any two must be indirect (disp = 0, 1, IR0, IR1),
and any two must be register (R0 - R7).

Opcode [Bit-field diagram, bits 31-0. Fields from MSB to LSB: opcode, P, d1, d2, src1, src2, src3 (bits 15-8), src4 (bits 7-0)]

Word Fields dst1 register (d1): 0 = R0, 1 = R1
dst2 register (d2): 0 = R2, 1 = R3
src1 register (R0 - R7)
src2 register (R0 - R7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)
P parallel-addressing modes (0 ≤ P ≤ 3)

P   Operation (P Field)
00  src3 x src4, src1 + src2
01  src3 x src1, src4 + src2
10  src1 x src2, src3 + src4
11  src3 x src1, src2 + src4

Operation srcA x srcB → dst1
|| srcD + srcC → dst2

Description An integer multiplication and an integer addition are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that if one of the parallel operations (MPYI3) reads from a register
and the operation being performed in parallel (ADDI3) writes to the same
register, then MPYI3 accepts as input the contents of the register before it is
modified by the ADDI3.
You can code any combination of addressing modes for the four possible
source operands as long as you code two as indirect and two as register. The
assignment of the source operands srcA - srcD to the src1 - src4 fields
varies, depending on the combination of addressing modes used; the P field is
encoded accordingly. The assembler may, when not significant, change the
order of operands in commutative operations to simplify processing.

Status Bits LUF Unchanged
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 0
Z 0
V 1 if an integer overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is affected by OVM bit value.
Cycles 1
Example MPYI3 R7,R4,R0
|| ADDI3 *-AR3,*AR5--(1),R3

                    Before Instruction     After Instruction
R7                  20                     20
R4                  100                    100
R0                                         2000
AR3                 80 981Fh               80 981Fh
AR5                 80 996Eh               80 996Dh
R3
Data at 80 981Eh    -53                    -53
Data at 80 996Eh    35h  53                35h  53
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]


MPYI3||STI  Parallel MPYI3 and STI

Syntax MPYI3 src2, src1, dst1
|| STI src3, dst2

Operands src1: register (R0 - R7)
src2: indirect (disp = 0, 1, IR0, IR1)
dst1: register (R0 - R7)
src3: register (R0 - R7)
dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode [Bit-field diagram, bits 31-0. Fields from MSB to LSB: opcode, dst1, src1, src3, dst2 (bits 15-8), src2 (bits 7-0)]

Word Fields None

Operation src1 x src2 → dst1
|| src3 → dst2

Description An integer multiplication and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that if one of the parallel operations (STI) reads from a register and
the operation being performed in parallel (MPYI3) writes to the same register,
then STI accepts as input the contents of the register before it is modified by
the MPYI3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Integer overflow occurs when any of the 32 MSBs of the 64-bit result differs
from the most significant bit of the 32-bit dst1 value.

Status Bits LUF Unchanged
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is affected by OVM bit value.
Cycles 1
Example MPYI3 *++AR0(1),R5,R7
|| STI R2,*-AR3(1)

                    Before Instruction     After Instruction
AR0                 80 995Ah               80 995Bh
R5                  50                     50
R7                                         10000
R2                  220                    220
AR3                 80 982Fh               80 982Fh
Data at 80 995Bh    0C8h  200              0C8h  200
Data at 80 982Eh    0h                     0DCh  220
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]


MPYI3||SUBI3  Parallel MPYI3 and SUBI3

Syntax MPYI3 srcA, srcB, dst1
|| SUBI3 srcC, srcD, dst2

Operands srcA, srcB, srcC, srcD: any two must be indirect (disp = 0, 1, IR0, IR1),
and any two must be register (R0 - R7).

Opcode [Bit-field diagram, bits 31-0. Fields from MSB to LSB: opcode, P, d1, d2, src1, src2, src3 (bits 15-8), src4 (bits 7-0)]

Word Fields dst1 register (d1): 0 = R0, 1 = R1
dst2 register (d2): 0 = R2, 1 = R3
src1 register (R0 - R7)
src2 register (R0 - R7)
src3 indirect (disp = 0, 1, IR0, IR1)
src4 indirect (disp = 0, 1, IR0, IR1)
P parallel-addressing modes (0 ≤ P ≤ 3)

P   Operation (P Field)
00  src3 x src4, src1 - src2
01  src3 x src1, src4 - src2
10  src1 x src2, src3 - src4
11  src3 x src1, src2 - src4

Operation srcA x srcB → dst1
|| srcD - srcC → dst2

Description An integer multiplication and an integer subtraction are performed in parallel.
All registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (MPYI3) reads from a
register and the operation being performed in parallel (SUBI3) writes to the
same register, then MPYI3 accepts as input the contents of the register before
it is modified by the SUBI3.

You can code any combination of addressing modes for the four possible
source operands as long as you code two as indirect and two as register. The
assignment of the source operands srcA - srcD to the src1 - src4 fields varies,
depending on the combination of addressing modes used; the P field is
encoded accordingly. The assembler may, when not significant, change the order
of operands in commutative operations to simplify processing.

Integer overflow occurs when any of the 32 MSBs of the 64-bit result differs
from the MSB of the 32-bit output value.

Status Bits LUF Unchanged
LV 1 if an integer overflow occurs, unchanged otherwise
UF 1 if an integer underflow occurs, 0 otherwise
N 0
Z 0
V 1 if an integer overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is affected by OVM bit value.
Cycles 1


Example MPYI3 R2,*++AR0(1),R0
|| SUBI3 *AR5--(IR1),R4,R2
or
MPYI3 *++AR0(1),R2,R0
|| SUBI3 *AR5--(IR1),R4,R2

                    Before Instruction     After Instruction
R2                  50                     800
AR0                                        80 98E4h
R0                                         4900
AR5                                        80 99F0h
IR1
R4                  2000                   2000
Data at 80 98E4h    62h  98                62h  98
Data at 80 99FCh    4B0h  1200             4B0h  1200
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]


MPYSHI  Multiply Signed Integer and Produce 32 MSBs

Syntax MPYSHI src, dst

Operands src: general-addressing modes (G)
dst: register mode (any register in CPU primary-register file)

Opcode 31      24 23      16 15      8 7      0

Word Fields G   src addressing modes
00  register (any register in CPU primary-register file)
01  direct
10  indirect
11  immediate

Operation dst x src → dst

Description The 32 MSBs of the product of the numbers at dst and src are loaded into the
dst register. These numbers, when read, are assumed to be signed 32-bit
integers. The result is assumed to be a signed 64-bit integer. The output to the
dst register is the 32 MSBs of the result. The MPYI instruction provides the 32
LSBs of the result.

Status Bits If ST (SET COND) = 0 and the destination register is R0 - R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.

LUF Unchanged
LV Unchanged
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if all 64 bits of the product are 0, 0 otherwise
V 0
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example None
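
Since no example is given for MPYSHI, here is a small Python sketch of the arithmetic the description implies. It is an illustration only: the helper names are hypothetical, and flag handling is limited to N and Z as listed under Status Bits.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mpyshi(dst, src):
    """Return (32 MSBs of the signed 64-bit product, N, Z)."""
    product = to_signed32(dst) * to_signed32(src)
    high = product >> 32             # arithmetic shift keeps the sign
    n = 1 if high < 0 else 0
    z = 1 if product == 0 else 0     # Z looks at all 64 bits of the product
    return high, n, z

# Operands taken from the MPYI R1,R5 example earlier in this chapter
high, n, z = mpyshi(0x0033C251, 0x0078B600)
low = 0xE21D9600                     # 32 LSBs that MPYI produced there
full = (high << 32) | low            # the complete 64-bit product
```

Combining the MPYSHI result with the MPYI result in this way reproduces the full product 3 392 081 x 7 910 912.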
MPYSHI3  Multiply Signed Integer Producing 32 MSBs, 3 Operands

Syntax MPYSHI3 src2, src1, dst

Operands src1: type 1 or type 2 three-operand addressing modes
src2: type 1 or type 2 three-operand addressing modes
dst: register mode (any register in CPU primary-register file)

Opcode Type 1   31      24 23      16 15      8 7      0
Type 2   31      24 23      16 15      8 7      0

Word Fields Type 1
T   src1 addressing modes                   src2 addressing modes
00  register mode (any CPU register)        register mode (any CPU register)
01  indirect mode (disp = 0, 1, IR0, IR1)   register mode (any CPU register)
10  register mode (any CPU register)        indirect mode (disp = 0, 1, IR0, IR1)
11  indirect mode (disp = 0, 1, IR0, IR1)   indirect mode (disp = 0, 1, IR0, IR1)

Type 2
T   src1 addressing modes                                   src2 addressing modes
00  register mode (any CPU register)                        8-bit signed immediate
01  register mode (any CPU register)                        indirect mode *+ARn(5-bit unsigned displacement)
10  indirect mode *+ARn(5-bit unsigned displacement)        8-bit signed immediate
11  indirect mode *+ARn1(5-bit unsigned displacement)       indirect mode *+ARn2(5-bit unsigned displacement)

Operation src1 x src2 → dst

Description The product of the numbers at the src1 and src2 operands is loaded into the
dst register. The numbers at the src1 and src2 operands are assumed to be
32-bit signed integers. The result is assumed to be a signed 64-bit integer. The
output to the dst register is the 32 MSBs of the result. The MPYI3 instruction
provides the 32 LSBs of the result.

Status Bits If ST (SET COND) = 0 and the destination register is R0 - R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.

LUF Unchanged
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example None
MPYUHI  Multiply Unsigned Integer and Produce 32 MSBs

Syntax MPYUHI src, dst

Operands src: general-addressing modes
dst: register mode (any register in CPU primary-register file)

Opcode 31      24 23      16 15      8 7      0

Word Fields G   src addressing modes
00  register (any register in CPU primary-register file)
01  direct
10  indirect
11  immediate

Operation dst x src → dst

Description The 32 MSBs of the product of the numbers at the dst and src operands are
loaded into the dst register. These numbers, when read, are assumed to be
unsigned 32-bit integers. The result is assumed to be an unsigned 64-bit integer.
The output to the dst register is the 32 MSBs of the result. The MPYI instruction
provides the 32 LSBs of the result.

Status Bits If ST (SET COND) = 0 and the destination register is R0 - R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.

LUF Unchanged
LV Unchanged
UF 0
N 0
Z 1 if all 64 bits of the product are 0, 0 otherwise
V 0
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example None
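
To show how the unsigned interpretation changes the upper word, the following Python sketch treats both operands as unsigned 32-bit values, as the MPYUHI description states. The function name is hypothetical, and only the Z flag is modeled because N and V are listed as 0.

```python
MASK32 = 0xFFFFFFFF

def mpyuhi(dst, src):
    """Return (32 MSBs of the unsigned 64-bit product, Z)."""
    product = (dst & MASK32) * (src & MASK32)  # operands read as unsigned
    z = 1 if product == 0 else 0               # Z looks at all 64 bits
    return product >> 32, z

# 0FFFF FFFFh is 4 294 967 295 when read as unsigned, not -1
high, z = mpyuhi(0xFFFFFFFF, 0xFFFFFFFF)
```

For this operand pair the upper word is 0FFFF FFFEh, whereas a signed multiply of the same bit patterns (-1 x -1 = 1) would give an upper word of 0.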


MPYUHI3  Multiply Unsigned Integer Producing 32 MSBs, 3 Operands

Syntax MPYUHI3 src2, src1, dst

Operands src1, src2: both type 1 or type 2 three-operand addressing modes
dst: register mode (any register in CPU primary-register file)

Opcode Type 1   31      24 23      16 15      8 7      0
Type 2   31      24 23      16 15      8 7      0

Word Fields Type 1
T   src1 addressing modes                   src2 addressing modes
00  register mode (any CPU register)        register mode (any CPU register)
01  indirect mode (disp = 0, 1, IR0, IR1)   register mode (any CPU register)
10  register mode (any CPU register)        indirect mode (disp = 0, 1, IR0, IR1)
11  indirect mode (disp = 0, 1, IR0, IR1)   indirect mode (disp = 0, 1, IR0, IR1)

Type 2
T   src1 addressing modes                                   src2 addressing modes
00  register mode (any CPU register)                        8-bit signed immediate
01  register mode (any CPU register)                        indirect mode *+ARn(5-bit unsigned displacement)
10  indirect mode *+ARn(5-bit unsigned displacement)        8-bit signed immediate
11  indirect mode *+ARn1(5-bit unsigned displacement)       indirect mode *+ARn2(5-bit unsigned displacement)

Operation src1 x src2 → dst

Description The product of the numbers at the src1 and src2 operands is loaded into the
dst register. The numbers at the src1 and src2 operands are assumed to be
32-bit unsigned integers. The result is assumed to be an unsigned 64-bit integer.
The output to the dst register is the 32 MSBs of the result. The MPYI3
instruction provides the 32 LSBs of the result.

Status Bits If ST (SET COND) = 0 and the destination register is R0 - R11, the condition
flags are modified. If ST (SET COND) = 1, they are modified for all destination
registers.

LUF Unchanged
LV Unchanged
UF 0
N 0
Z 1 if all 64 bits of the product are 0, 0 otherwise
V 0
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1
Example None


NEGB  Negate Integer With Borrow

Syntax NEGB src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31      24 23      16 15      8 7      0

Word Fields G   src addressing modes
00  register (any register in CPU primary-register file)
01  direct
10  indirect
11  immediate

Operation 0 - src - C → dst

Description The difference of the 0, src, and C operands, calculated as shown, is loaded
into the dst register. The dst and src are assumed to be signed integers.

Status Bits If ST (SETCOND) = 0 and the destination register is R0 - R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow occurs, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.
Cycles 1


Example NEGB R5,R7

        Before Instruction     After Instruction
R5      0FFFF FFCBh            0FFFF FFCBh
R7      0h                     34h  52
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]
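
The NEGB R5,R7 example can be reproduced with a short Python sketch of 0 - src - C. The incoming carry value of 1 is inferred from the result shown (0FFFF FFCBh is -53, and 53 - 1 = 52); the function name and the way borrow is computed are assumptions of this sketch, and OVM saturation is not modeled.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def negb(src, carry):
    """Return (32-bit result, borrow flag, overflow flag) for 0 - src - C."""
    exact = 0 - to_signed32(src) - carry      # exact signed difference
    result = exact & MASK32                   # value written to dst
    # Treating the operands as unsigned, subtracting anything non-zero
    # from 0 requires a borrow.
    borrow = 1 if (src & MASK32) + carry > 0 else 0
    overflow = to_signed32(result) != exact   # result does not fit in 32 bits
    return result, borrow, overflow

result, borrow, overflow = negb(0xFFFFFFCB, 1)
```

Here `result` is 34h (52), matching the After Instruction value of R7.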


NEGF  Negate Floating-Point Value

Syntax NEGF src, dst
Operands src: general-addressing modes (G)
dst: register (R0 - R11)
Opcode 31      24 23      16 15      8 7      0
Word Fields G   src addressing modes
00  register (any register in CPU primary-register file)
01  direct
10  indirect
11  immediate
Operation 0 - src → dst
Description The difference of the 0 and src operands is loaded into the dst register. The
dst and src operands are assumed to be floating-point numbers.
Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected
Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1


Example NEGF *++AR3(2),R1

                    Before Instruction                 After Instruction
AR3                 80 9800h                           80 9802h
R1                  05 7B40 0025h  6.28125006e+01      -1.4050e+02
Data at 80 9802h    70C 8000h  1.4050e+02              1.4050e+02
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]


NEGF||STF  Parallel NEGF and STF

Syntax NEGF src2, dst1
|| STF src3, dst2

Operands src2: indirect (disp = 0, 1, IR0, IR1)
dst1: register (R0 - R7)
src3: register (R0 - R7)
dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode [Bit-field diagram, bits 31-0. Fields from MSB to LSB: opcode, dst1, 000, src3, dst2 (bits 15-8), src2 (bits 7-0)]

Word Fields None

Operation 0 - src2 → dst1
|| src3 → dst2

Description A floating-point negation and a floating-point store are performed in parallel.
All registers are read at the beginning and loaded at the end of the execute
cycle. This means that if one of the parallel operations (STF) reads from a
register and the operation being performed in parallel (NEGF) writes to the same
register, then STF accepts as input the contents of the register before it is
modified by the NEGF.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1


Example NEGF *AR4--(1),R7
|| STF R2,*++AR5(1)

                    Before Instruction           After Instruction
AR4                 80 98E1h                     80 98E0h
R7                                               05 84C0 0000h  -6.281250e+01
R2                  1.79750e+02                  07 33C0 0000h  1.79750e+02
AR5                 80 9803h                     80 9804h
Data at 80 98E1h    57B 4000h  6.281250e+01      6.281250e+01
Data at 80 9804h    0h                           1.79750e+02
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]
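
The register values in these examples are printed both as hexadecimal and as decimal floating-point numbers. As a cross-check, the Python sketch below decodes the 40-bit register form (8-bit exponent followed by a 32-bit sign-plus-fraction word). The layout used here is an assumption drawn from the floating-point format chapter of this manual rather than from this section: two's-complement exponent, sign bit, 31-bit fraction, an implied leading bit that is the complement of the sign, and an exponent of -128 representing zero. The function name is hypothetical.

```python
def decode_extended(exp_byte, mantissa_word):
    """Decode an 8-bit exponent and 32-bit mantissa word into a Python float."""
    # Exponent is an 8-bit two's-complement number.
    e = exp_byte - 256 if exp_byte & 0x80 else exp_byte
    if e == -128:
        return 0.0                       # reserved encoding for zero
    sign = (mantissa_word >> 31) & 1
    frac = (mantissa_word & 0x7FFFFFFF) / 2.0 ** 31
    # Implied bit: mantissa is 01.f when positive and 10.f when negative,
    # read as a two's-complement value.
    mantissa = (-2.0 + frac) if sign else (1.0 + frac)
    return mantissa * 2.0 ** e

r2 = decode_extended(0x07, 0x33C00000)   # R2 in the example above
r7 = decode_extended(0x05, 0x84C00000)   # R7 after NEGF
```

Under these assumptions, 07 33C0 0000h decodes to 179.75 and 05 84C0 0000h to -62.8125, in agreement with the decimal values printed beside them.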


NEGI  Negate Integer

Syntax NEGI src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31      24 23      16 15      8 7      0

Word Fields G   src addressing modes
00  register (any register in CPU primary-register file)
01  direct
10  indirect
11  immediate

Operation 0 - src → dst

Description The difference of the 0 and src operands is loaded into the dst register. The
dst and src operands are assumed to be signed integers.

Status Bits If ST (SETCOND) = 0 and the destination register is R0 - R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow occurs, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.
Cycles 1


Example NEGI 174,R5 (174 = 0AEh)

        Before Instruction     After Instruction
R5      220                    -174
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]


NEGI||STI  Parallel NEGI and STI

Syntax NEGI src2, dst1
|| STI src3, dst2

Operands src2: indirect (disp = 0, 1, IR0, IR1)
dst1: register (R0 - R7)
src3: register (R0 - R7)
dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode 31      24 23      16 15      8 7      0

Word Fields None

Operation 0 - src2 → dst1
|| src3 → dst2

Description An integer negation and an integer store are performed in parallel. All registers
are read at the beginning and loaded at the end of the execute cycle. This
means that if one of the parallel operations (STI) reads from a register and the
operation being performed in parallel (NEGI) writes to the same register, then
STI accepts as input the contents of the register before it is modified by the
NEGI.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Status Bits LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow occurs, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.
Cycles 1


Example NEGI *-AR3,R2
|| STI R2,*AR1++

                    Before Instruction     After Instruction
AR3                 80 982Fh               80 982Fh
R2                  25                     -220
AR1
Data at 80 982Eh    220                    220
Data at 80 98A5h    0h                     25
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]
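
This example is a convenient illustration of the read-before-write rule, because R2 is both the store source and the negation destination. The Python sketch below models the two phases with a dictionary for registers and one for memory; the function name and data structures are hypothetical, addresses are passed in already resolved (AR1 is assumed to point at 80 98A5h, the location shown in the table), and address-register updates and status flags are not modeled.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def negi_sti(regs, mem, src2_addr, dst1, src3, dst2_addr):
    """Model NEGI src2,dst1 || STI src3,dst2 with reads before writes."""
    # Read phase: both operations sample their inputs first.
    operand = mem[src2_addr]
    store_value = regs[src3]
    # Write phase: results are committed at the end of the cycle.
    regs[dst1] = to_signed32(-operand)
    mem[dst2_addr] = store_value

regs = {"R2": 25}
mem = {0x80982E: 220, 0x8098A5: 0}
# NEGI *-AR3,R2 || STI R2,*AR1++ with AR3 = 80 982Fh
negi_sti(regs, mem, src2_addr=0x80982E, dst1="R2", src3="R2", dst2_addr=0x8098A5)
```

After the call, R2 holds -220 while memory at 80 98A5h holds 25, the value R2 had before the instruction.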




NOP  No Operation

Syntax NOP src

Operands src: general-addressing modes (G)

Opcode 31      24 23      16 15      8 7      0
000 011001 | G | dst | src

Word Fields G   src addressing modes
00  register (no operation)
10  indirect (modify ARn, 0 ≤ n ≤ 7)

Operation No ALU or multiplier operations.
ARn is modified if src is specified in indirect mode.

Description If the src operand is specified in the indirect mode, the specified addressing
operation is performed, and a dummy memory read occurs. If the src operand
is omitted, no operation is performed.

Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1

Example 1 NOP

        Before Instruction     After Instruction
PC

Example 2 NOP *AR3--(1)

        Before Instruction     After Instruction
PC
AR3     80 9900h               80 98FFh
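
Example 2 uses NOP purely for its addressing side effect. A minimal Python sketch of the post-displacement subtract and modify form *ARn--(disp) follows; the function name is hypothetical and a 32-bit address register is assumed.

```python
ADDR_MASK = 0xFFFFFFFF  # assumed width of an auxiliary register

def post_decrement(ar, disp=1):
    """Return (address used for the dummy read, updated ARn) for *ARn--(disp)."""
    address = ar & ADDR_MASK            # the access uses ARn before modification
    new_ar = (ar - disp) & ADDR_MASK    # ARn is then decremented by disp
    return address, new_ar

address, ar3 = post_decrement(0x809900, 1)
```

With AR3 = 80 9900h, the dummy read addresses 80 9900h and AR3 becomes 80 98FFh, as in Example 2.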


NORM  Normalize

Syntax NORM src, dst

Operands src: general-addressing modes (G)
dst: register (R0 - R11)

Opcode 31      24 23      16 15      8 7      0
000 011010 | G | dst | src

Word Fields G   src addressing modes
00  register (any register in CPU primary-register file)
01  direct
10  indirect
11  immediate

Operation norm (src) → dst

Description The src operand is assumed to be an unnormalized floating-point number; for
example, the implied bit is set equal to the sign bit. The dst is set equal to the
normalized src operand with the implied bit removed. The dst operand
exponent is set to the src operand exponent minus the size of the left-shift
necessary to normalize the src. The dst operand is assumed to be a normalized
floating-point number.

For values of src:

- If src (exp) = -128 and src (man) = 0, then dst = 0, Z = 1, and UF = 0.
- If src (exp) = -128 and src (man) ≠ 0, then dst = 0, Z = 0, and UF = 1.
- For all other cases of the src, if a floating-point underflow occurs, then
  dst (man) is forced to 0 and dst (exp) = -128. If src (man) = 0, then
  dst (man) = 0 and dst (exp) = -128. Refer to Section 5.7 on page 5-27.

Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV Unaffected
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 0
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.
Cycles 1


Example NORM R1,R2

        Before Instruction     After Instruction
R1
R2                             F2 6BD4 0000h  1.12451613e-04
[Status-flag boxes for LUF, LV, UF, N, Z, V, C]


NOT  Bitwise Logical Complement

Syntax NOT src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31      24 23      16 15      8 7      0

Word Fields G   src addressing modes
00  register (any register in CPU primary-register file)
01  direct
10  indirect
11  immediate

Operation ~src → dst

Description The bitwise-logical complement of the src operand is loaded into the dst
register. The complement is formed by a logical NOT of each bit of the src
operand. The dst and src operands are assumed to be unsigned integers.

Status Bits If ST (SETCOND) = 0 and the destination register is R0 - R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero result is generated, 0 otherwise
V 0
C Unaffected

Mode Bit OVM operation is affected by OVM bit value.
Cycles 1


Bitwise Logical Complement NOT 


Example NOT @982Ch,R4 

Before Instruction After Instruction 
DP DP 
Ra R4 
Data at 80 982Ch Data at 80 982Ch 

SE2Fh 
LuF[__ LuF[_ 
vf vw Lo] 
UF [__ UF [0 
a) N 
z C9 z [9] 
v Lo v Lo] 
¢ [ —__ 4 c Lo] 


Assembly Language Instructions 14-199 
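The effect of NOT on a 32-bit value and on the status bits it changes can be sketched in Python. This is only an illustrative model, not TI code: the function name `c4x_not` and the flag dictionary are hypothetical, the operand value is arbitrary, and the sketch assumes the SETCOND rule permits the flags to be updated.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as 32-bit unsigned values


def c4x_not(src):
    """Model ~src -> dst and the flags the NOT entry says are written."""
    dst = ~src & MASK32
    flags = {
        "UF": 0,                 # always cleared
        "N": (dst >> 31) & 1,    # MSB of the output
        "Z": int(dst == 0),      # 1 only for an all-zero result
        "V": 0,                  # always cleared
    }
    return dst, flags


dst, flags = c4x_not(0x00005E2F)
print(f"{dst:08X}", flags)
```

LUF, LV, and C are left out of the model because the entry lists them as unaffected.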


NOT||STI      Parallel NOT and STI

Syntax        NOT src2, dst1
              || STI src3, dst2

Operands      src2: indirect (disp = 0, 1, IR0, IR1)
              dst1: register (R0 - R7)
              src3: register (R0 - R7)
              dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   None

Operation     ~src2 → dst1
              || src3 → dst2

Description   A bitwise-logical NOT and an integer store are performed in
              parallel. All registers are read at the beginning and loaded at
              the end of the execute cycle. This means that if one of the
              parallel operations (STI) reads from a register and the
              operation being performed in parallel (NOT) writes to the same
              register, then STI accepts as input the contents of the register
              before it is modified by the NOT.

              If src2 and dst2 point to the same location, src2 is read before
              the write to dst2.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    0
              N     MSB of the output
              Z     1 if a zero result is generated, 0 otherwise
              V     0
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       NOT *+AR2,R3
              || STI R7,*--AR4(IR1)

                                 Before Instruction   After Instruction
              AR2                80 99CBh             80 99CBh
              R7                 220                  220
              AR4                80 9850h             80 9840h
              Data at 80 9840h   0h                   0DCh (220)

              The R3 and IR1 contents, the data at 80 99CCh, and the
              status-flag values are not legible.
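The read-before-write rule for the parallel pair can be illustrated with a small Python model. It is a sketch under stated assumptions, not a description of the hardware: the function `not_parallel_sti`, the dictionaries standing in for the register file and memory, and the addresses are all hypothetical.

```python
MASK32 = 0xFFFFFFFF


def not_parallel_sti(regs, mem, src2_addr, dst1, src3, dst2_addr):
    """Model NOT src2,dst1 || STI src3,dst2 with reads first, writes last."""
    # Read phase: both sources are sampled before anything is modified.
    not_input = mem[src2_addr]
    store_value = regs[src3]
    # Write phase: results are loaded at the end of the cycle.
    regs[dst1] = ~not_input & MASK32
    mem[dst2_addr] = store_value


# NOT writes R3 while STI stores R3: the store sees the old R3 value.
regs = {"R3": 0x000000DC}
mem = {0x100: 0x0000000F, 0x200: 0}
not_parallel_sti(regs, mem, 0x100, "R3", "R3", 0x200)
print(hex(regs["R3"]), hex(mem[0x200]))
```

Sampling both inputs into local variables before either assignment is what keeps the two halves from interfering, including the case where src2 and dst2 name the same memory location.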


OR            Bitwise Logical OR

Syntax        OR src, dst

Operands      src: general-addressing modes (G)
              dst: register (any register in CPU primary-register file)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   G     src addressing modes
              00    register (any register in CPU primary-register file)
              01    direct
              10    indirect
              11    immediate

Operation     dst OR src → dst

Description   The bitwise-logical OR between the src and dst operands is
              loaded into the dst register. The dst and src operands are
              assumed to be unsigned integers.

Status Bits   If ST (SETCOND) = 0 and the destination register is R0 - R11,
              the condition flags are modified. If ST (SETCOND) = 1, they are
              modified for all destination registers.

              LUF   Unaffected
              LV    Unaffected
              UF    0
              N     MSB of the output
              Z     1 if a zero result is generated, 0 otherwise
              V     0
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       OR *++AR1(IR1),R2

                                 Before Instruction   After Instruction
              AR1                80 9800h             80 9804h
              R2                 01256 0000h          01256 2BCDh
              Data at 80 9804h   2BCDh                2BCDh

              The IR1 contents and the status-flag values are not legible.


OR3           Bitwise Logical OR, 3 Operands

Syntax        OR3 src2, src1, dst

Operands      src1, src2: type 1 or type 2 three-operand addressing modes
              dst: register mode (any register in CPU primary-register file)

Opcode        Type 1
              31      24 23      16 15       8 7        0
              (leading bits not legible) | T | dst | src1 | src2

              Type 2
              31      24 23      16 15       8 7        0
              (leading bits not legible) | T | dst | src1 | src2

Word Fields   Type 1
              T     src1 addressing modes         src2 addressing modes
              00    register mode                 register mode
                    (any CPU register)            (any CPU register)
              01    indirect mode                 register mode
                    (disp = 0, 1, IR0, IR1)       (any CPU register)
              10    register mode                 indirect mode
                    (any CPU register)            (disp = 0, 1, IR0, IR1)
              11    indirect mode                 indirect mode
                    (disp = 0, 1, IR0, IR1)       (disp = 0, 1, IR0, IR1)

              Type 2
              T     src1 addressing modes         src2 addressing modes
              00    register mode                 8-bit signed immediate
                    (any CPU register)
              01    register mode                 indirect mode *+ARn
                    (any CPU register)            (5-bit unsigned displacement)
              10    indirect mode *+ARn           8-bit signed immediate
                    (5-bit unsigned displacement)
              11    indirect mode *+ARn1          indirect mode *+ARn2
                    (5-bit unsigned displacement) (5-bit unsigned displacement)

Operation     src1 OR src2 → dst

Description   The bitwise-logical OR between the numbers at the src1 and src2
              operands is loaded into the dst register. The numbers at the
              src1, src2, and dst operands are assumed to be unsigned
              integers. The src2 immediate-addressing mode is sign extended.

Status Bits   If ST (SETCOND) = 0, the condition flags are modified if the
              destination register is R0 - R11. If ST (SETCOND) = 1, they are
              modified for all destination registers.

              LUF   Unaffected
              LV    Unaffected
              UF    0
              N     MSB of the output
              Z     1 if a zero result is generated, 0 otherwise
              V     0
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       None
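The OR3 description states that the src2 immediate operand is sign extended. A minimal Python sketch of what that implies for an 8-bit immediate follows; the helper names `sign_extend8` and `or3_imm` are hypothetical, and the model assumes a 32-bit result and ignores the status bits.

```python
MASK32 = 0xFFFFFFFF


def sign_extend8(imm):
    """Widen an 8-bit two's-complement value to a 32-bit pattern."""
    imm &= 0xFF
    return (imm - 0x100 if imm & 0x80 else imm) & MASK32


def or3_imm(src1, imm8):
    """Model src1 OR sign_extend(imm8) -> dst for a type 2 immediate."""
    return (src1 | sign_extend8(imm8)) & MASK32


print(f"{or3_imm(0x1234, 0x0F):08X}")  # immediate with bit 7 clear
print(f"{or3_imm(0x1234, 0x80):08X}")  # immediate with bit 7 set
```

When bit 7 of the immediate is set, the extension fills bits 31-8 with ones, so the upper 24 bits of the result are all ones regardless of src1.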


OR3||STI      Parallel OR3 and STI

Syntax        OR3 src2, src1, dst1
              || STI src3, dst2

Operands      src1: register (R0 - R7)
              src2: indirect (disp = 0, 1, IR0, IR1)
              dst1: register (R0 - R7)
              src3: register (R0 - R7)
              dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   None

Operation     src1 OR src2 → dst1
              || src3 → dst2

Description   A bitwise-logical OR and an integer store are performed in
              parallel. All registers are read at the beginning and loaded at
              the end of the execute cycle. This means that if one of the
              parallel operations (STI) reads from a register and the
              operation being performed in parallel (OR3) writes to the same
              register, then STI accepts as input the contents of the register
              before it is modified by the OR3.

              If src2 and dst2 point to the same location, src2 is read before
              the write to dst2.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    0
              N     MSB of the output
              Z     1 if a zero result is generated, 0 otherwise
              V     0
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       OR3 *++AR2,R5,R2
              || STI R6,*AR1--

                                 Before Instruction   After Instruction
              AR2                80 9830h             80 9831h
              R6                 220                  220
              AR1                80 9883h             80 9882h
              Data at 80 9831h   9800h                9800h
              Data at 80 9883h   0h                   0DCh (220)

              The R5 and R2 contents and the status-flag values are not
              legible.


POP           POP Integer

Syntax        POP dst

Operands      dst: register (any register in CPU primary-register file)

Opcode        31      24 23      16 15       8 7        0
              000 011100 01 | dst | 0000000000000000

Word Fields   None

Operation     *SP-- → dst

Description   The top of the current system stack is popped and loaded into
              the 32 LSBs of the dst register. The top of the stack is assumed
              to be a signed integer. The POP is performed with a
              postdecrement of the stack pointer. The eight MSBs (exponent) of
              an extended-precision dst register (R0 - R11) are left
              unmodified. If required, they can be recovered with a POPF
              instruction.

Status Bits   If ST (SETCOND) = 0 and the destination register is R0 - R11,
              the condition flags are modified. If ST (SETCOND) = 1, they are
              modified for all destination registers.

              LUF   Unaffected
              LV    Unaffected
              UF    0
              N     1 if a negative result is generated, 0 otherwise
              Z     1 if a zero result is generated, 0 otherwise
              V     0
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       POP R3

                                 Before Instruction   After Instruction
              R3                 4 826                -62 044
              Data at 80 9856h   -62 044              -62 044

              The SP contents and the status-flag values are not legible.


POPF          POP Floating-Point Value

Syntax        POPF dst

Operands      dst: register (R0 - R11)

Opcode        31      24 23      16 15       8 7        0
              000 011101 01 | dst | 0000000000000000

Word Fields   None

Operation     *SP-- → dst

Description   The top of the current system stack is popped and loaded into
              the dst register (32 MSBs). The eight LSBs of the dst register
              mantissa are set to 0. For this reason, POPF must be executed
              before the POP instruction when you are preserving all 40
              register bits. The top of the stack is assumed to be a
              floating-point number. The POP is performed with a postdecrement
              of the stack pointer.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    0
              N     1 if a negative result is generated, 0 otherwise
              Z     1 if a zero result is generated, 0 otherwise
              V     0
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       POPF R4

                                 Before Instruction   After Instruction
              R4                 6.91186578e+00       5.32544007e+28
              Data at 80 984Ah   5.32544007e+28       5.32544007e+28

              The SP contents and the status-flag values are not legible.


PUSH          PUSH Integer

Syntax        PUSH src

Operands      src: register (any register in CPU primary-register file)

Opcode        31      24 23      16 15       8 7        0
              000 011110 01 | src | 0000000000000000

Word Fields   None

Operation     src → *++SP

Description   The contents of the src register (32 LSBs) are pushed onto the
              current system stack. The integer or mantissa portion of an
              extended-precision register (R0 - R11) is saved with this
              instruction. The 8 MSBs (exponent) can be pushed with the PUSHF
              instruction. The src is assumed to be a signed integer. The PUSH
              is performed with a preincrement of the stack pointer.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       PUSH R6

                                 Before Instruction   After Instruction
              R6                 33 115               33 115
              Data at 80 98AFh   (not legible)        33 115

              The SP contents and the status-flag values are not legible.


PUSHF         PUSH Floating-Point Value

Syntax        PUSHF src

Operands      src: register (R0 - R11)

Opcode        31      24 23      16 15       8 7        0
              000 011111 01 | src | 0000000000000000

Word Fields   None

Operation     src → *++SP

Description   The contents of the src register (32 MSBs) are pushed onto the
              current system stack. The src is assumed to be a floating-point
              number. The PUSH is performed with a preincrement of the stack
              pointer. The eight LSBs of the mantissa are not saved (notice
              the difference between R2 and the value on the stack in the
              example below), but they can be saved with the PUSH instruction.
              PUSHF should be executed after the PUSH instruction.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       PUSHF R2

                                 Before Instruction   After Instruction
              R2                 025C12 8081h         025C12 8081h
                                 (6.87725854e+00)     (6.87725854e+00)
              Data at 80 9802h   0h                   6.87725830e+00

              The SP contents and the status-flag values are not legible.
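The ordering rules in the PUSH, PUSHF, POP, and POPF entries (PUSH before PUSHF when saving, POPF before POP when restoring) can be checked with a small Python model of a 40-bit extended-precision register. Everything here is a hypothetical sketch: the `Stack` class, the helper names, and the representation of a register as one 40-bit integer (8-bit exponent in the MSBs, 32-bit mantissa or integer in the LSBs) are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK40 = 0xFFFFFFFFFF


class Stack:
    """Word-addressed stack: push is *++SP, pop is *SP--."""

    def __init__(self, sp=0):
        self.sp = sp
        self.mem = {}

    def push(self, word):
        self.sp += 1                      # preincrement
        self.mem[self.sp] = word & MASK32

    def pop(self):
        word = self.mem[self.sp]
        self.sp -= 1                      # postdecrement
        return word


def push_int(stack, reg40):
    stack.push(reg40 & MASK32)            # PUSH saves the 32 LSBs


def push_float(stack, reg40):
    stack.push((reg40 >> 8) & MASK32)     # PUSHF saves the 32 MSBs


def pop_float(stack):
    return (stack.pop() << 8) & MASK40    # POPF loads 32 MSBs, 8 LSBs = 0


def pop_int(stack, reg40):
    exponent = reg40 & ~MASK32 & MASK40   # POP keeps the 8 MSBs
    return exponent | stack.pop()


stack = Stack()
r2 = 0x025C128081
push_int(stack, r2)        # save: PUSH first ...
push_float(stack, r2)      # ... then PUSHF
restored = pop_float(stack)            # restore: POPF first ...
restored = pop_int(stack, restored)    # ... then POP
print(f"{restored:010X}")
```

Reversing the restore order would let POPF clear the eight mantissa LSBs that POP had just reloaded, which is why the entries prescribe POPF before POP.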


RCPF          Reciprocal of Floating-Point Value

Syntax        RCPF src, dst

Operands      src: extended-precision register-, direct-, and
                   indirect-addressing modes
              dst: R0 - R11

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   G     src addressing modes
              00    register (any register in CPU primary-register file)
              01    direct
              10    indirect
              11    immediate

Operation     16-bit reciprocal of src → dst

Description   The 16-bit approximation of the reciprocal of the src operand is
              loaded into the dst register. The dst and src operands are
              assumed to be floating-point numbers.

Status Bits   LUF   1 if a floating-point underflow occurs, unchanged otherwise
              LV    1 if a floating-point overflow occurs, unchanged otherwise
              UF    1 if a floating-point underflow occurs, 0 otherwise
              N     1 if a negative result is generated, 0 otherwise
              Z     1 if a zero result is generated, 0 otherwise
              V     1 if a floating-point overflow occurs, 0 otherwise
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       None


RETIcond      Return From Interrupt or Trap Conditionally

Syntax        RETIcond

Operands      None

Opcode        31      24 23      16 15       8 7        0
              01111000000 | cond | 0000000000000000

Word Fields   None

Operation     If (cond is true)
                 *(SP) → PC
                 ST(PGIE) → ST(GIE)
                 ST(PCF) → ST(CF)
              Else, continue

Description   If the condition is true, then the top of the stack is popped to
              the PC, PGIE is copied to GIE, and PCF is copied to CF. If the
              condition is not true, normal operation continues (see Section
              14.2 on page 14-12 for a list of condition mnemonics, encoding,
              and flags).

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        4

Example       None


RETIcondD     Return From Interrupt or Trap Conditionally Delayed

Syntax        RETIcondD

Operands      None

Opcode        31      24 23      16 15       8 7        0
              01111000001 | cond | 0000000000000000

Word Fields   None

Operation     If (cond is true)
                 *(SP) → PC
                 ST(PGIE) → ST(GIE)
                 ST(PCF) → ST(CF)
              Else, continue

Description   Performs a delayed return from an interrupt or trap.

              Because this is a delayed return, the three instructions
              following the RETIcondD are fetched and executed. These three
              instructions should not modify the program flow, load the status
              register, or modify the stack pointer (SP) register. See Section
              14.2 for a list of condition mnemonics, encoding, and flags.

              Interrupts are disabled for the duration of the RETIcondD.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       None


RETScond      Return From Subroutine Conditionally

Syntax        RETScond

Operands      None

Opcode        31      24 23      16 15       8 7        0
              01111000100 | cond | 0000000000000000

Word Fields   None

Operation     If cond is true:
                 *SP-- → PC
              Else, continue.

Description   A conditional return is performed. If the condition is true, the
              top of the stack is popped to the PC.

              The 'C4x provides 20 condition codes that can be used with this
              instruction (see Section 14.2 for a list of condition mnemonics,
              encoding, and flags).

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        4

Example       RETSGE

              The before and after values of PC, SP, the data at 80 983Ch,
              and the status flags are not legible.


RND           Round Floating-Point Value

Syntax        RND src, dst

Operands      src: general-addressing modes (G)
              dst: register (R0 - R11)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   G     src addressing modes
              00    register (any register in CPU primary-register file)
              01    direct
              10    indirect
              11    immediate

Operation     rnd(src) → dst

Description   The result of rounding the src operand is loaded into the dst
              register. The src operand is rounded to the nearest
              single-precision floating-point value. If the src operand is
              exactly halfway between two single-precision values, it is
              rounded to the more positive of those values. Note that rounding
              a value of 0 sets the underflow (UF) bit, not the zero (Z)
              status bit.

Status Bits   LUF   1 if a floating-point underflow occurs, unchanged otherwise
              LV    1 if a floating-point overflow occurs, unchanged otherwise
              UF    1 if a floating-point underflow occurs or the src operand
                    is zero, 0 otherwise
              N     1 if a negative result is generated, 0 otherwise
              Z     Unaffected
              V     1 if a floating-point overflow occurs, 0 otherwise
              C     Unaffected

Mode Bit      OVM   Operation is affected by OVM bit value.

Cycles        1

Example       RND R5,R2

                    Before Instruction    After Instruction
              R5    07 33C1 6EEFh         07 33C1 6EEFh
                    (1.79755599e+02)      (1.79755599e+02)
              R2    (not legible)         07 33C1 6EEFh
                                          (1.79755600e+02)

              The status-flag values are not legible.


ROL           Rotate Left

Syntax        ROL dst

Operands      dst: register (any register in CPU primary-register file)

Opcode        31      24 23      16 15       8 7        0
              000 100011 11 | dst | 0000000000000001

Word Fields   None

Operation     dst left-rotated 1 bit → dst

Description   The contents of the dst operand are left-rotated one bit and
              loaded into the dst register. This is a circular rotate with the
              MSB transferred into the LSB.

              Rotate left:
              [Figure: rotate-left diagram for dst]

Status Bits   If ST (SETCOND) = 0 and the destination register is R0 - R11,
              the condition flags are modified. If ST (SETCOND) = 1, they are
              modified for all destination registers.

              LUF   Unaffected
              LV    Unaffected
              UF    0
              N     MSB of the output
              Z     1 if a zero output is generated, 0 otherwise
              V     0
              C     Set to the value of the bit rotated out of the high-order
                    bit

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       ROL R3

              The before and after values of R3 and the status flags are not
              legible.
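The circular rotate described for ROL can be sketched in Python as follows. The function name `rol32` and the sample operand are hypothetical; the model assumes a 32-bit register and returns the carry value that the entry says is taken from the bit rotated out of the high-order position.

```python
MASK32 = 0xFFFFFFFF


def rol32(value):
    """Rotate a 32-bit value left by one; return (result, carry)."""
    value &= MASK32
    msb = value >> 31                      # bit leaving the high-order end
    result = ((value << 1) | msb) & MASK32 # MSB re-enters as the LSB
    return result, msb                     # carry receives the old MSB


result, carry = rol32(0x80025CD4)
print(f"{result:08X}", carry)
```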


ROLC          Rotate Left Through Carry

Syntax        ROLC dst

Operands      dst: register (any register in CPU primary-register file)

Opcode        31      24 23      16 15       8 7        0
              000 100100 11 | dst | 0000000000000001

Word Fields   None

Operation     dst left-rotated 1 bit through carry bit → dst

Description   The contents of the dst operand are left-rotated one bit through
              the carry bit and loaded into the dst register. The MSB is
              rotated to the carry bit; at the same time, the carry bit is
              transferred to the LSB.

              Rotate left through carry bit:
              [Figure: rotate-left-through-carry diagram for C and dst]

Status Bits   If ST (SETCOND) = 0 and the destination register is R0 - R11,
              the condition flags are modified. If ST (SETCOND) = 1, they are
              modified for all destination registers.

              LUF   Unaffected
              LV    Unaffected
              UF    0
              N     MSB of the output
              Z     1 if a zero output is generated, 0 otherwise
              V     0
              C     Set to the value of the bit rotated out of the high-order
                    bit

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example 1     ROLC R3

              The before and after values of R3 and the status flags are not
              legible.

Example 2     ROLC R3

              The before and after values of R3 and the status flags are not
              legible.
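A rotate through carry behaves like a 33-bit rotate in which the carry bit is the extra position. The Python sketch below models the ROLC description; the function name `rolc32` and the operand values are hypothetical, and only the data path and carry are modeled.

```python
MASK32 = 0xFFFFFFFF


def rolc32(value, carry_in):
    """Rotate left one bit through carry; return (result, carry_out)."""
    value &= MASK32
    carry_out = value >> 31                            # MSB moves to carry
    result = ((value << 1) | (carry_in & 1)) & MASK32  # old carry becomes LSB
    return result, carry_out


print(rolc32(0x00000420, 1))
print(rolc32(0x80004281, 0))
```

Applying the function 33 times in a row, feeding each carry back in, returns the original value and carry, which is a quick way to confirm the 33-bit behavior.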


ROR           Rotate Right

Syntax        ROR dst

Operands      dst: register (any register in CPU primary-register file)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   None

Operation     dst right-rotated 1 bit → dst

Description   The contents of the dst operand are right-rotated one bit and
              loaded into the dst register. The LSB is rotated into the carry
              bit and also transferred into the MSB.

              Rotate right:
              [Figure: rotate-right diagram for dst]

Status Bits   If ST (SETCOND) = 0 and the destination register is R0 - R11,
              the condition flags are modified. If ST (SETCOND) = 1, they are
              modified for all destination registers.

              LUF   Unaffected
              LV    Unaffected
              UF    0
              N     MSB of the output
              Z     1 if a zero output is generated, 0 otherwise
              V     0
              C     Set to the value of the bit rotated out of the low-order
                    bit

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       ROR R7

              The before and after values of R7 and the status flags are not
              legible.


RORC          Rotate Right Through Carry

Syntax        RORC dst

Operands      dst: register (any register in CPU primary-register file)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   None

Operation     dst right-rotated 1 bit through carry bit → dst

Description   The contents of the dst operand are right-rotated one bit
              through the status register's carry bit. This can be viewed as a
              33-bit rotate. The carry bit value is rotated into the MSB of
              the dst; at the same time, the dst LSB is rotated into the carry
              bit.

              Rotate right through carry bit:
              [Figure: rotate-right-through-carry diagram for dst and C]

Status Bits   If ST (SETCOND) = 0 and the destination register is R0 - R11,
              the condition flags are modified. If ST (SETCOND) = 1, they are
              modified for all destination registers.

              LUF   Unaffected
              LV    Unaffected
              UF    0
              N     MSB of the output
              Z     1 if a zero output is generated, 0 otherwise
              V     0
              C     Set to the value of the bit rotated out of the low-order
                    bit

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       RORC R4

              The before and after values of R4 and the status flags are not
              legible.
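The two right rotates differ only in what enters the MSB: ROR feeds the outgoing LSB back in, while RORC feeds in the previous carry. A Python sketch of both, following the ROR and RORC descriptions, is given here; the function names `ror32` and `rorc32` and the operands are hypothetical, and status bits other than carry are not modeled.

```python
MASK32 = 0xFFFFFFFF


def ror32(value):
    """Rotate right one bit; the LSB goes to both the MSB and the carry."""
    value &= MASK32
    lsb = value & 1
    return (value >> 1) | (lsb << 31), lsb


def rorc32(value, carry_in):
    """Rotate right one bit through carry; return (result, carry_out)."""
    value &= MASK32
    carry_out = value & 1                          # LSB moves to carry
    return (value >> 1) | ((carry_in & 1) << 31), carry_out


print(ror32(0x00000421))
print(rorc32(0x80000420, 0))
```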


RPTB          Repeat Block

Syntax        RPTB src

Operands      src: 24-bit signed immediate displacement or register mode

Opcode        For 24-bit signed immediate mode:
              31      24 23      16 15       8 7        0
              01100100 | src (displacement)

              For register mode:
              31      24 23      16 15       8 7        0
              011110010 | 000000000000000000 | src

Word Fields   None

Operation     src + PC + 1 → RE
              1 → ST (RM)
              Next PC → RS

Description   RPTB allows a block of instructions to be repeated a number of
              times without any penalty for looping.

              It activates the block-repeat mode of updating the PC. The src
              operand can be a 32-bit register value or a 24-bit signed
              immediate value (displacement). The resulting src address is the
              end address of the block to be repeated. This address is loaded
              into the repeat-end address (RE) register. A 1 is written into
              the repeat-mode bit of the status register [ST(RM)] to indicate
              that the PC is to be updated in the repeat mode. The address of
              the next instruction is loaded into the repeat-start address
              (RS) register.

              RE should be greater than or equal to RS (RE ≥ RS). Otherwise,
              the code does not repeat, even though the RM bit remains set
              to 1.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        4

Example       None
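The register updates listed under Operation for the immediate form of RPTB can be written out as a short Python sketch. The function `rptb_setup` is hypothetical, and two assumptions are made that the entry does not state outright: that PC in "src + PC + 1" is the address of the RPTB instruction itself, and that "Next PC" is that address plus 1.

```python
def rptb_setup(pc, displacement):
    """Model RPTB with a signed immediate displacement.

    pc: assumed address of the RPTB instruction.
    displacement: signed 24-bit immediate from the instruction word.
    """
    re = displacement + pc + 1   # src + PC + 1 -> RE (block end)
    rs = pc + 1                  # Next PC -> RS (block start), assumed PC + 1
    return {
        "RS": rs,
        "RE": re,
        "RM": 1,                 # repeat mode is enabled either way
        "will_repeat": re >= rs, # the entry requires RE >= RS
    }


state = rptb_setup(0x100, 4)
print({k: hex(v) if k in ("RS", "RE") else v for k, v in state.items()})
```

The `will_repeat` field mirrors the caution in the description: a block end below the block start leaves RM set but the code does not repeat.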


RPTBD         Repeat Block Delayed

Syntax        RPTBD src

Operands      src: 24-bit signed immediate displacement or register mode

Opcode        For 24-bit signed immediate mode:
              31      24 23      16 15       8 7        0
              01100101 | src (displacement)

              For register mode:
              31      24 23      16 15       8 7        0
              011110011 | 000000000000000000 | src

Word Fields   None

Operation     If src is an immediate value (displacement):
                 src + PC + 3 → RE
              Else:
                 src → RE
              1 → ST (RM)
              PC of RPTBD + 4 → RS

Description   RPTBD allows a block of instructions to be repeated a number of
              times without any penalty for looping and with single-cycle
              execution of the RPTBD instruction. It activates the
              block-repeat mode of updating the PC. The src operand can be a
              32-bit register value or a 24-bit signed immediate value
              (displacement). The resulting src address is loaded into the
              repeat-end address (RE) register (block-end address). A 1 is
              written to the status-register repeat-mode bit [ST(RM)],
              indicating that the PC is to be updated in the repeat mode. The
              address of the next instruction + 3 is loaded into the
              repeat-start address (RS) register.

              RE should be greater than or equal to RS (RE ≥ RS). Otherwise,
              the code will not repeat, even though the RM bit remains set
              to 1.

              RPTBD does not flush the pipeline. The three instructions
              following RPTBD are executed and should not modify the program
              flow. These three instructions are not part of the block that is
              repeated. The RC register must be loaded before the RPTBD
              instruction executes. It should not be loaded in the three
              instructions after RPTBD.

              Interrupts are disabled during the next three instructions after
              RPTBD.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       None
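The address arithmetic in the RPTBD Operation field can be sketched in Python. The function `rptbd_setup` and its `immediate` flag are hypothetical, and the sketch assumes that PC in "src + PC + 3" is the address of the RPTBD instruction, the same PC used in "PC of RPTBD + 4".

```python
def rptbd_setup(pc, src, immediate=True):
    """Model the RE/RS updates of RPTBD.

    pc: assumed address of the RPTBD instruction.
    src: signed displacement (immediate form) or absolute end address
         (register form).
    """
    re = src + pc + 3 if immediate else src   # block-end address
    rs = pc + 4                               # first instruction of the block
    return {
        "RS": rs,
        "RE": re,
        "RM": 1,
        # The three instructions after RPTBD run once, outside the block.
        "delay_slots": [pc + 1, pc + 2, pc + 3],
    }


state = rptbd_setup(0x200, 5)
print(hex(state["RS"]), hex(state["RE"]), [hex(a) for a in state["delay_slots"]])
```

Because RS is four words past RPTBD, the three delay-slot instructions sit between the RPTBD and the repeated block, which matches the statement that they are not part of the block.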


RPTS          Repeat Single

Syntax        RPTS src

Operands      src: general-addressing modes (G)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   G     src addressing modes
              00    register
              01    direct
              10    indirect
              11    immediate

Operation     src → RC
              1 → ST (RM)
              1 → S
              Next PC → RS
              Next PC → RE

Description   The RPTS instruction allows a single instruction to be repeated
              a number of times without any penalty for looping. Fetches can
              also be made from the instruction register (IR), thus avoiding
              repeated memory access.

              The src operand is loaded into the repeat counter (RC). A 1 is
              written into the repeat-mode (RM) bit of the status register
              (ST). A 1 is also written into the repeat-single bit (S). This
              indicates that the program fetches are to be performed only from
              the instruction register. The next PC is loaded into the
              repeat-end address (RE) register and the repeat-start address
              (RS) register.

              For the immediate mode, the src operand is assumed to be an
              unsigned integer and is not sign-extended.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        4

Example       RPTS AR5

              The before and after values of PC, ST, RS, RE, and RC are not
              legible. A value of 0FFh appears in both columns, but the
              register it belongs to cannot be determined.

Note:         The RPTS instruction is not interruptible. Interrupts are held
              pending until the RPTS instruction has finished executing. In
              timing-critical applications this can make timings inaccurate,
              so use RPTS with caution there.


RSQRF         Reciprocal of Square Root Floating-Point Value

Syntax        RSQRF src, dst

Operands      src: extended-precision register, direct-, and
                   indirect-addressing modes
              dst: extended-precision register

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   G     src addressing modes
              00    extended-precision register
              01    direct
              10    indirect
              11    16-bit immediate

Operation     16-bit reciprocal of the square root of src → dst

Description   The 16-bit approximation of the reciprocal of the square root of
              the number at the src operand is loaded into the dst register.
              The number at the src operand is assumed to be positive. The
              operation for negative inputs is undefined.

              The values at the dst and src operands are assumed to be
              floating-point numbers.

Status Bits   LUF   Unchanged
              LV    1 if input is zero, unchanged otherwise
              UF    0
              N     0
              Z     0
              V     1 if input is zero, 0 otherwise
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       None


SIGI          Signal, Interlocked

Syntax        SIGI src, dst

Operands      src: direct- and indirect-addressing modes (assumed to be
                   signed integer)
              dst: register mode (assumed to be signed integer)

Opcode        31      24 23      16 15       8 7        0
              000 101100 | G | dst | src

Word Fields   G     src addressing modes
              01    direct
              10    indirect

Operation     LOCK (or LLOCK) pin brought low
              src → dst
              LOCK (or LLOCK) pin brought high

Description   An interlocking operation is signaled by the appropriate
              bus-lock signal (LOCK or LLOCK) if, and only if, an
              external-memory access is performed. The src and dst operands
              are assumed to be signed integers. After the read is performed,
              the bus-lock signal is deasserted. If an internal-memory access
              is performed, SIGI performs the read but does not assert a
              bus-lock signal. Refer to Section 9.7 on page 9-39 for a
              detailed description.

Status Bits   If ST (SETCOND) = 0 and the destination register is R0 - R11,
              the condition flags are modified. If ST (SETCOND) = 1, they are
              modified for all destination registers.

              LUF   Unaffected
              LV    Unaffected
              UF    0
              N     1 if a negative result is generated, 0 otherwise
              Z     1 if a zero result is generated, 0 otherwise
              V     0
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       None


STF           Store Floating-Point Value

Syntax        STF src, dst

Operands      src: register (R0 - R11)
              dst: general-addressing modes (G)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   G     dst addressing modes
              01    direct
              10    indirect

Operation     src → dst

Description   The src register is loaded into the dst memory location. The src
              and dst operands are assumed to be floating-point numbers.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       STF R2,@98A1h

                                 Before Instruction   After Instruction
              R2                 052C50 1900h         052C50 1900h
                                 (4.30782204e+01)     (4.30782204e+01)
              Data at 80 98A1h   (not legible)        4.30782204e+01

              The DP contents and the status-flag values are not legible.


STFI          Store Floating-Point Value, Interlocked

Syntax        STFI src, dst

Operands      src: register (R0 - R11)
              dst: general-addressing modes (G)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   G     dst addressing modes
              01    direct
              10    indirect

Operation     src → dst
              Signal end of interlocked operation.

Description   The src register is loaded into the dst memory location. An
              interlocked operation is signaled over LOCK or LLOCK. The src
              and dst operands are assumed to be floating-point numbers. Refer
              to Section 9.7 on page 9-39 for detailed information.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       STFI R3,*-AR4

                                 Before Instruction   After Instruction
              R3                 07 33C0 0000h        07 33C0 0000h
                                 (1.79750e+02)        (1.79750e+02)
              AR4                80 993Ch             80 993Ch
              Data at 80 993Bh   0h                   1.79750e+02

              The status-flag values are not legible.


STF||STF      Parallel Store Floating-Point Value

Syntax        STF src2, dst2
              || STF src1, dst1

Operands      src1: register (Rn1, 0 ≤ n1 ≤ 7)
              dst1: indirect (disp = 0, 1, IR0, IR1)
              src2: register (Rn2, 0 ≤ n2 ≤ 7)
              dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode        31      24 23      16 15       8 7        0
              (bit-field diagram not legible)

Word Fields   None

Operation     src2 → dst2
              || src1 → dst1

Description   Two STF instructions are executed in parallel. Both src1 and
              src2 are assumed to be floating-point numbers.

Status Bits   LUF   Unaffected
              LV    Unaffected
              UF    Unaffected
              N     Unaffected
              Z     Unaffected
              V     Unaffected
              C     Unaffected

Mode Bit      OVM   Operation is not affected by OVM bit value.

Cycles        1

Example       STF R4,*AR3--
              || STF R3,*++AR5

                                 Before Instruction   After Instruction
              R4                 1.4050e+02           1.4050e+02
              R3                 1.79750e+02          1.79750e+02
              Data at 80 9835h   0h                   1.4050e+02
              Data at 80 99D3h   0h                   1.79750e+02

              The AR3 and AR5 contents and the status-flag values are not
              legible.


STI Store Integer

Syntax STI src, dst

Operands src: register (any register in CPU primary-register file)
dst: general-addressing modes (G)

Opcode 31 24 23 16 15 8 7 0
000101010 G src dst

Word Fields G src addressing modes
01 direct
10 indirect

Operation src → dst

Description The src register is loaded into the dst memory location. The src and dst
operands are assumed to be signed integers.

Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example STI R4,@982Bh

                     Before Instruction    After Instruction
DP                   80h                   80h
R4                   273 367               273 367
Data at 80 982Bh     58 876                273 367
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0


STII Store Integer, Interlocked

Syntax STII src, dst

Operands src: register (any register in CPU primary-register file)
dst: general-addressing modes (G)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
01 direct
10 indirect

Operation src → dst
Signal end of interlocked operation.

Description The src register is loaded into the dst memory location. An interlocked
operation is signaled over LOCK or LLOCK. The src and dst operands are assumed
to be signed integers. Refer to Section 9.7 on page 9-39 for detailed
information.

Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example STII R1,@98AEh

Before Instruction After Instruction
DP DP 80h
R1 R1
Data at 80 98AEh Data at 80 98AEh
25Ch


STI||STI Parallel STI and STI

Syntax STI src2, dst2
|| STI src1, dst1

Operands src1: register (Rn1, 0 ≤ n1 ≤ 7)
dst1: indirect (disp = 0, 1, IR0, IR1)
src2: register (Rn2, 0 ≤ n2 ≤ 7)
dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode 31 24 23 16 15 8 7 0
Fields, left to right: src2, 000, src1, dst1, dst2

Word Fields None

Operation src2 → dst2
|| src1 → dst1

Description Two integer stores are performed in parallel. If both stores are executed
to the same address, the value written is that of STI src2, dst2.

Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example STI R0,*++AR2(IR0)
|| STI R5,*AR0

                     Before Instruction    After Instruction
R0                   220                   220
AR2                  80 9830h              80 9838h
IR0                  8h                    8h
R5                   53                    53
AR0                  80 98D3h              80 98D3h
Data at 80 9838h     0h                    0DCh  220
Data at 80 98D3h     0h                    35h   53
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0
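The same-address rule in the STI||STI description can be illustrated with a small Python model. This is only a sketch: the function name `parallel_sti` and the dict-based memory are illustrative assumptions, and the order of the two assignments is a modeling device, not a statement about hardware timing.

```python
def parallel_sti(memory, src2, dst2, src1, dst1):
    """Model of STI src2,dst2 || STI src1,dst1 on a dict-based memory.

    Per the description, when both destinations are the same address,
    the value that remains is the one written by STI src2, dst2.
    """
    memory[dst1] = src1   # store from the second-listed instruction
    memory[dst2] = src2   # store from the first-listed instruction wins on a clash
    return memory


# Values from the manual's example: R0 = 220 to 80 9838h, R5 = 53 to 80 98D3h.
mem = parallel_sti({}, src2=220, dst2=0x809838, src1=53, dst1=0x8098D3)

# Hypothetical clash: both stores target 80 9838h.
clash = parallel_sti({}, src2=220, dst2=0x809838, src1=53, dst1=0x809838)
```

With distinct addresses both words are written; in the clash case only the src2 value survives.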


STIK Store Integer Immediate Value

Syntax STIK src, dst

Operands src: 5-bit signed integer
dst: direct and indirect mode

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 direct
11 indirect

Operation src → dst

Description The 5-bit signed integer src value is loaded into the dst memory location.
The src and dst operands are assumed to be signed integers.

Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example None
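A 5-bit signed src field can hold values from -16 to 15. The sketch below shows the sign extension implied by "5-bit signed integer"; the helper name `sign_extend_5bit` is illustrative and not part of any TI tool.

```python
def sign_extend_5bit(field):
    """Sign-extend a 5-bit two's-complement field to a Python int."""
    field &= 0x1F                      # keep only the 5 field bits
    return field - 0x20 if field & 0x10 else field
```

A value outside -16..15 cannot be encoded in the field and would have to be stored with STI from a register instead.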


SUBB Subtract Integer With Borrow

Syntax SUBB src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 register (any register in CPU primary-register file)
01 direct
10 indirect
11 immediate

Operation dst - src - C → dst

Description The difference of the dst, src, and C operands, as calculated above, is
loaded into the dst register. The dst and src operands are assumed to be signed
integers.

If ST (SETCOND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

Status Bits LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow occurs, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.

Cycles 1

Example SUBB *AR5++(4),R5

                     Before Instruction    After Instruction
AR5                  80 9800h              80 9804h
R5                   250                   50
Data at 80 9800h     199                   199
LUF LV UF N Z V C    0 0 0 0 0 0 1         0 0 0 0 0 0 0
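The borrow input is what lets SUBB extend a subtraction beyond one 32-bit word: the low words are subtracted without borrow, and the C flag that results is fed into the subtraction of the high words. The Python model below is a sketch of that chaining; the function names and the 64-bit wrapper `sub64` are illustrative assumptions, and flags other than C are not modeled.

```python
MASK32 = 0xFFFFFFFF


def subi(dst, src):
    """dst - src -> dst on 32-bit words; returns (result, C), C = 1 on borrow."""
    return (dst - src) & MASK32, int(dst < src)


def subb(dst, src, c):
    """dst - src - C -> dst on 32-bit words; returns (result, C)."""
    return (dst - src - c) & MASK32, int(dst < src + c)


def sub64(a, b):
    """Subtract two unsigned 64-bit integers one 32-bit word at a time."""
    low, c = subi(a & MASK32, b & MASK32)      # low words, no borrow in
    high, c = subb(a >> 32, b >> 32, c)        # high words consume the borrow
    return (high << 32) | low, c
```

For instance, `subb(250, 199, 1)` returns `(50, 0)`, which matches the register values in the SUBB example when C is 1 before the instruction.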


SUBB3 Subtract Integer With Borrow, 3 Operands

Syntax SUBB3 src2, src1, dst

Operands src1, src2: type 1 or type 2 three-operand addressing modes
dst: register mode (any register in CPU primary-register file)

Opcode Type 1
31 24 23 16 15 8 7 0

Type 2
31 24 23 16 15 8 7 0
001101100 T dst

Word Fields Type 1

T   src1 addressing modes                    src2 addressing modes
00  register mode (any CPU register)         register mode (any CPU register)
01  indirect mode (disp = 0, 1, IR0, IR1)    register mode (any CPU register)
10  register mode (any CPU register)         indirect mode (disp = 0, 1, IR0, IR1)
11  indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

Type 2

T   src1 addressing modes                                 src2 addressing modes
00  register mode (any CPU register)                      8-bit signed immediate
01  register mode (any CPU register)                      indirect mode *+ARn(5-bit unsigned displacement)
10  indirect mode *+ARn(5-bit unsigned displacement)      8-bit signed immediate
11  indirect mode *+ARn1(5-bit unsigned displacement)     indirect mode *+ARn2(5-bit unsigned displacement)

Operation src1 - src2 - C → dst

Description The difference of the src1 and src2 operands and the C (carry) flag is
loaded into the dst register. The src1, src2, and dst operands are assumed to be
signed integers.

If ST (SETCOND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

Status Bits LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow is generated, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.

Cycles 1

Example None


SUBC Subtract Integer Conditionally

Syntax SUBC src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 register (any register in CPU primary-register file)
01 direct
10 indirect
11 immediate

Operation If (dst - src ≥ 0):
(dst - src << 1) OR 1 → dst
Else:
dst << 1 → dst

Description The src operand is subtracted from the dst operand. The dst operand is
loaded with a value that depends upon the result of the subtraction. If
(dst - src) is greater than or equal to zero, then (dst - src) is left-shifted one
bit, the least-significant bit is set to 1, and the result is loaded into the dst
register. If (dst - src) is less than zero, dst is left-shifted one bit and loaded
into the dst register. The dst and src operands are assumed to be unsigned
integers.

SUBC can be used to perform a single step of a multibit-integer division.

Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example 1 SUBC @98C5h,R1

                     Before Instruction    After Instruction
DP                   80h                   80h
R1                   04F6h  1270           0C9h  201
Data at 80 98C5h     492h   1170           492h  1170
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0

Example 2 SUBC 3000,R0 (3000 = 0BB8h)

                     Before Instruction    After Instruction
R0                   07D0h  2000           0FA0h  4000
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0
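The SUBC operation and its use as a division step can be modeled in a few lines of Python. This is a sketch under stated assumptions: `subc` follows the Operation field above, while `divide16`, its 16-bit operand limit, the 15-bit divisor alignment, and the 16-step loop are an illustrative shift-and-subtract arrangement rather than a routine taken from this manual.

```python
MASK32 = 0xFFFFFFFF


def subc(src, dst):
    """One SUBC step on unsigned 32-bit values; returns the new dst."""
    if dst >= src:                                  # dst - src >= 0
        return (((dst - src) << 1) | 1) & MASK32    # shift in a quotient bit of 1
    return (dst << 1) & MASK32                      # shift in a quotient bit of 0


def divide16(dividend, divisor):
    """Unsigned 16-bit division built from 16 SUBC steps."""
    if not (0 <= dividend < 1 << 16 and 0 < divisor < 1 << 16):
        raise ValueError("operands must fit in 16 bits, divisor nonzero")
    src = divisor << 15          # align the divisor with the dividend's MSB
    dst = dividend
    for _ in range(16):
        dst = subc(src, dst)
    # Low 16 bits hold the quotient, the bits above hold the remainder.
    return dst & 0xFFFF, dst >> 16
```

The single-step function reproduces both manual examples: `subc(1170, 1270)` gives 201 and `subc(3000, 2000)` gives 4000.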


SUBF Subtract Floating-Point Value

Syntax SUBF src, dst

Operands src: general-addressing modes (G)
dst: register (R0-R11)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 register (R0-R11)
01 direct
10 indirect
11 immediate

Operation dst - src → dst

Description The result of the dst operand minus the src operand is loaded into the dst
register. The dst and src operands are assumed to be floating-point numbers.

Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example SUBF *AR0--(IR0),R5

                     Before Instruction    After Instruction
AR0
IR0
R5                   1.79750000e+02        3.9250e+01
Data at 80 9888h     1.4050e+02            1.4050e+02
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0


SUBF3 Subtract Floating-Point Value, 3 Operands

Syntax SUBF3 src2, src1, dst

Operands src1, src2: type 1 or type 2 three-operand addressing modes
dst: register mode (R0-R11)

Opcode Type 1
31 24 23 16 15 8 7 0
Fields after the opcode bits: T, dst, src1, src2

Type 2
31 24 23 16 15 8 7 0

Word Fields Type 1

T   src1 addressing modes                    src2 addressing modes
00  register mode (R0-R11)                   register mode (R0-R11)
01  indirect mode (disp = 0, 1, IR0, IR1)    register mode (R0-R11)
10  register mode (R0-R11)                   indirect mode (disp = 0, 1, IR0, IR1)
11  indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

Type 2

T   src1 addressing modes                                 src2 addressing modes
01  register mode (any CPU register)                      indirect mode *+ARn(5-bit unsigned displacement)
11  indirect mode *+ARn1(5-bit unsigned displacement)     indirect mode *+ARn2(5-bit unsigned displacement)

Operation src1 - src2 → dst

Description The difference of the src1 and src2 operands is loaded into the dst
register. The src1, src2, and dst operands are assumed to be floating-point
numbers.

Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example None


SUBF3||STF Parallel SUBF3 and STF

Syntax SUBF3 src1, src2, dst1
|| STF src3, dst2

Operands src1: register (R0-R7)
src2: indirect (disp = 0, 1, IR0, IR1)
dst1: register (R0-R7)
src3: register (R0-R7)
dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode 31 24 23 16 15 8 7 0
Fields after the opcode bits: dst1, src1, src3, dst2, src2

Word Fields None

Operation src2 - src1 → dst1
|| src3 → dst2

Description A floating-point subtraction and a floating-point store are performed in
parallel. All registers are read at the beginning and loaded at the end of the
execute cycle. This means that if one of the parallel operations (STF) reads from
a register and the operation being performed in parallel (SUBF3) writes to the
same register, then STF accepts as input the contents of the register before it is
modified by the SUBF3.

If src3 and dst1 point to the same location, src3 is read before the write to dst1.

Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example SUBF3 R1,*-AR4(IR1),R0
|| STF R7,*+AR5(IR0)

                     Before Instruction         After Instruction
R1                   6.28125e+01                6.28125e+01
AR4                  80 98B8h                   80 98B8h
IR1                  8h                         8h
R0                                              7.768750e+01
R7                   1.79750e+02                1.79750e+02
AR5                  80 9850h                   80 9850h
IR0                  10h                        10h
Data at 80 98B0h     70C 8000h  1.4050e+02      1.4050e+02
Data at 80 9860h     0h                         1.79750e+02
LUF LV UF N Z V C    0 0 0 0 0 0 0              0 0 0 0 0 0 0
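The read-then-write rule for parallel instructions can be sketched as a two-phase update. The Python model below is illustrative only: the function name, the dict-based register file and memory, and the explicit address arguments are assumptions made for the example, and status bits are not modeled.

```python
def subf3_par_stf(regs, mem, src1, src2_addr, dst1, src3, dst2_addr):
    """Model of SUBF3 src1,src2,dst1 || STF src3,dst2.

    All operands are read first and results are written afterwards, so STF
    stores the old value of src3 even when SUBF3 targets the same register.
    """
    a = regs[src1]              # read phase
    b = mem[src2_addr]
    stored = regs[src3]
    regs[dst1] = b - a          # write phase: src2 - src1 -> dst1
    mem[dst2_addr] = stored     #              src3 -> dst2
    return regs, mem
```

Called with R1 = 62.8125, R7 = 179.75, and 140.5 at 80 98B0h, the model leaves 77.6875 in R0 and 179.75 at 80 9860h, as in the example above. If src3 and dst1 name the same register, the stored value is the one held before the subtraction result is written.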


SUBI Subtract Integer

Syntax SUBI src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 register (any register in CPU primary-register file)
01 direct
10 indirect
11 immediate

Operation dst - src → dst

Description The result of the dst operand minus the src operand is loaded into the dst
register. The dst and src operands are assumed to be signed integers.

If ST (SETCOND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

Status Bits LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow occurs, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.

Cycles 1

Example SUBI 220,R7

                     Before Instruction    After Instruction
R7                   550                   330
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0
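The status-bit rules for SUBI, and the SETCOND condition that decides whether they are applied, can be written out as a short Python model. This is a sketch with the overflow mode assumed off (OVM = 0, so the result simply wraps to 32 bits); the function names and the dict of flags are illustrative, and the sticky LV bit is omitted.

```python
MASK32 = 0xFFFFFFFF


def subi_flags(dst, src):
    """dst - src on 32-bit words; returns (result, flags) assuming OVM = 0."""
    result = (dst - src) & MASK32

    def sign(x):
        return (x >> 31) & 1

    flags = {
        "N": sign(result),
        "Z": int(result == 0),
        # Signed overflow: operand signs differ and the result sign flips.
        "V": int(sign(dst) != sign(src) and sign(result) != sign(dst)),
        "C": int(dst < src),        # borrow on the unsigned subtraction
    }
    return result, flags


def flags_updated(setcond, dst_reg):
    """True when the condition flags are modified for this destination."""
    return setcond == 1 or dst_reg in {f"R{i}" for i in range(12)}
```

For the example above, `subi_flags(550, 220)` yields 330 with every flag at 0; with SETCOND = 0, a destination such as AR2 would leave the flags untouched.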


SUBI3 Subtract Integer, 3 Operands

Syntax SUBI3 src2, src1, dst

Operands src1, src2: type 1 or type 2 three-operand addressing modes
dst: register mode (any register in CPU primary-register file)

Opcode Type 1
31 24 23 16 15 8 7 0

Type 2
31 24 23 16 15 8 7 0

Word Fields Type 1

T   src1 addressing modes                    src2 addressing modes
00  register mode (any CPU register)         register mode (any CPU register)
01  indirect mode (disp = 0, 1, IR0, IR1)    register mode (any CPU register)
10  register mode (any CPU register)         indirect mode (disp = 0, 1, IR0, IR1)
11  indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

Type 2

T   src1 addressing modes                                 src2 addressing modes
00  register mode (any CPU register)                      8-bit signed immediate
01  register mode (any CPU register)                      indirect mode *+ARn(5-bit unsigned displacement)
10  indirect mode *+ARn(5-bit unsigned displacement)      8-bit signed immediate
11  indirect mode *+ARn1(5-bit unsigned displacement)     indirect mode *+ARn2(5-bit unsigned displacement)

Operation src1 - src2 → dst

Description The result of the src1 operand minus the src2 operand is loaded into the
dst register. The src1, src2, and dst operands are assumed to be signed integers.

If ST (SETCOND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

Status Bits LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow is generated, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.

Cycles 1

Example None


SUBI3||STI Parallel SUBI3 and STI

Syntax SUBI3 src1, src2, dst1
|| STI src3, dst2

Operands src1: register (R0-R7)
src2: indirect (disp = 0, 1, IR0, IR1)
dst1: register (R0-R7)
src3: register (R0-R7)
dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode 31 24 23 16 15 8 7 0

Word Fields None

Operation src2 - src1 → dst1
|| src3 → dst2

Description An integer subtraction and an integer store are performed in parallel. All
registers are read at the beginning and loaded at the end of the execute cycle.
This means that if one of the parallel operations (STI) reads from a register and
the operation being performed in parallel (SUBI3) writes to the same register,
then STI accepts as input the contents of the register before it is modified by
the SUBI3.

If src3 and dst1 point to the same location, src3 is read before the write to dst1.

Status Bits LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow occurs, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.

Cycles 1

Example SUBI3 R7,*+AR2(IR0),R1
|| STI R3,*++AR7

                     Before Instruction    After Instruction
R7                   20                    20
AR2                  80 982Fh              80 982Fh
IR0                  10h                   10h
R1                                         200
R3                   53                    53
AR7                  80 983Bh              80 983Ch
Data at 80 983Fh     0DCh  220             0DCh  220
Data at 80 983Ch     0h                    35h   53
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0


SUBRB Subtract Reverse Integer With Borrow

Syntax SUBRB src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 register (any register in CPU primary-register file)
01 direct
10 indirect
11 immediate

Operation src - dst - C → dst

Description The difference of the src, dst, and C operands, as calculated above, is
loaded into the dst register. The dst and src operands are assumed to be signed
integers.

If ST (SETCOND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

Status Bits LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow occurs, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.

Cycles 1

Example SUBRB R4,R6

                     Before Instruction    After Instruction
R4                   971                   971
R6                   600                   370
LUF LV UF N Z V C    0 0 0 0 0 0 1         0 0 0 0 0 0 0


SUBRF Subtract Reverse Floating-Point Values

Syntax SUBRF src, dst

Operands src: general-addressing modes (G)
dst: register (R0-R11)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 register (R0-R11)
01 direct
10 indirect
11 immediate

Operation src - dst → dst

Description The result of the src operand minus the dst operand is loaded into the dst
register. The dst and src operands are assumed to be floating-point numbers.

Status Bits LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example SUBRF @9905h,R5

                     Before Instruction    After Instruction
DP                   80h                   80h
R5                   6.281250e+01          06 69E0 0000h  1.16937500e+02
Data at 80 9905h     1.79750e+02           1.79750e+02
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0


SUBRI Subtract Reverse Integer

Syntax SUBRI src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 register (any register in CPU primary-register file)
01 direct
10 indirect
11 immediate

Operation src - dst → dst

Description The result of the src operand minus the dst operand is loaded into the dst
register. The dst and src operands are assumed to be signed integers.

If ST (SETCOND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

Status Bits LUF Unaffected
LV 1 if an integer overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an integer overflow occurs, 0 otherwise
C 1 if a borrow occurs, 0 otherwise

Mode Bit OVM operation is affected by OVM bit value.

Cycles 1

Example SUBRI *AR5++(IR0),R3

                     Before Instruction    After Instruction
AR5                  80 9900h              80 9908h
IR0                  8h                    8h
R3                   0DCh  220             014Ah  330
Data at 80 9900h     226h  550             226h   550
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0


SWI Software Interrupt

Syntax SWI

Operands None

Opcode 31 24 23 16 15 8 7 0
011001100 00 000000000000000000000

Word Fields None

Operation Performs an emulation interrupt

Description The SWI instruction performs an emulator interrupt. This is a reserved
instruction and should not be used in normal programming.

Status Bits LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 4

Example None


TOIEEE Convert to IEEE Format

Syntax TOIEEE src, dst

Operands src: extended-precision register (R0-R11), direct- and indirect-addressing modes
dst: extended-precision register

Opcode 31 24 23 16 15 8 7 0
000110111 G dst src

Word Fields G src addressing modes
00 register [extended-precision register (R0-R11)]
01 direct
10 indirect
11 immediate

Operation convert src to IEEE format → dst

Description The src operand is converted from a 2s-complement floating-point format
to the IEEE floating-point format.

The src operand is assumed to be a single-precision floating-point number,
except in immediate mode, where it is treated as a short 16-bit floating-point
format. The converted result goes into the 32 MSBs of the dst register. STF can
be used to store the result to memory.

Status Bits LUF Unaffected
LV 1 if an overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example None
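Since the manual gives no example for TOIEEE, the conversion can be illustrated with a Python model. This is a sketch under an assumption that is not restated in this section: the single-precision 2s-complement format is taken to be an 8-bit signed exponent in bits 31-24, a sign bit in bit 23, and a 23-bit fraction in bits 22-0, with an exponent of -128 representing zero. The function names are illustrative, and range edge cases (values that overflow or become denormal in IEEE format) are not handled the way the hardware would handle them.

```python
import struct


def c4x_single_to_float(word):
    """Decode a 32-bit 2s-complement floating-point word to a Python float.

    Assumed layout: bits 31-24 signed exponent e, bit 23 sign s,
    bits 22-0 fraction f; e = -128 encodes zero.
    """
    e = (word >> 24) & 0xFF
    e = e - 256 if e & 0x80 else e          # exponent is two's complement
    s = (word >> 23) & 1
    f = (word & 0x7FFFFF) / float(1 << 23)
    if e == -128:
        return 0.0
    mantissa = (-2.0 + f) if s else (1.0 + f)
    return mantissa * 2.0 ** e


def to_ieee(word):
    """Return the IEEE 754 single-precision bit pattern for the same value."""
    value = c4x_single_to_float(word)
    return struct.unpack(">I", struct.pack(">f", value))[0]
```

Under the assumed layout, the word 0733C000h decodes to 179.75 (the value used in the STF examples), and its IEEE single-precision pattern is 4333C000h.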


TOIEEE||STF Parallel TOIEEE and STF

Syntax TOIEEE src2, dst1
|| STF src3, dst2

Operands src2: indirect mode (disp = 0, 1, IR0, IR1)
dst1: register mode (Rn1, 0 ≤ n1 ≤ 7)
src3: register mode (Rn1, 0 ≤ n1 ≤ 7)
dst2: indirect mode (disp = 0, 1, IR0, IR1)

Opcode 31 24 23 16 15 8 7 0

Word Fields None

Operation convert src2 to IEEE format → dst1
in parallel with
src3 → dst2

Description The src2 operand is converted from a 2s-complement floating-point format
to the IEEE floating-point format.

The src2 operand is assumed to be a single-precision floating-point number.
The converted result goes into the 32 MSBs of the dst1 register. A floating-point
store is done in parallel.

If src2 and dst2 point to the same location, then src2 is read before the write
to dst2.

Status Bits LUF Unaffected
LV 1 if an overflow occurs, unchanged otherwise
UF 0
N 1 if a negative result is generated, 0 otherwise
Z 1 if a zero result is generated, 0 otherwise
V 1 if an overflow occurs, 0 otherwise
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example None


TRAPcond Trap Conditionally

Syntax TRAPcond N

Operands N: immediate mode (0 ≤ N ≤ 511)

Opcode 31 24 23 16 15 8 7 0
0111010 0000 cond 0000000 N

Word Fields None

Operation If (cond is true)
ST(GIE) → ST(PGIE)
ST(CF) → ST(PCF)
0 → ST(GIE)
1 → ST(CF)
next PC → *(++SP)
trap vector N → PC
Else, continue.

Description If the condition is true, then GIE and CF are saved in PGIE and PCF in the
status register, all interrupts are disabled (0 → GIE), and the cache is frozen
(1 → CF). Then, the contents of the PC are pushed onto the system stack, and the
PC is loaded with the contents of the specified trap vector (N). If the condition
is not true, then normal operation continues.

If traps are to be nested, you may need to save the status register before
executing TRAPcond.

Status Bits GIE Set to 0 if TRAP executes
LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 5

Example None
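The TRAPcond operation sequence can be traced with a small Python model of the machine state. This is a sketch: the dict layout, the function name `trap_cond`, and the use of PC + 1 for "next PC" are illustrative assumptions, and the condition is passed in already evaluated.

```python
def trap_cond(cond, n, state, trap_vectors):
    """Apply the TRAPcond operation to a dict-based machine state."""
    if not cond:
        return state                               # condition false: continue
    st = state["ST"]
    st["PGIE"], st["PCF"] = st["GIE"], st["CF"]    # save GIE and CF
    st["GIE"], st["CF"] = 0, 1                     # disable interrupts, freeze cache
    state["SP"] += 1                               # next PC -> *(++SP)
    state["stack"][state["SP"]] = state["PC"] + 1  # "next PC" assumed to be PC + 1
    state["PC"] = trap_vectors[n]                  # trap vector N -> PC
    return state


# Hypothetical starting state and vector table for illustration.
state = {
    "ST": {"GIE": 1, "CF": 0, "PGIE": 0, "PCF": 0},
    "SP": 0x100,
    "PC": 0x2000,
    "stack": {},
}
trap_cond(True, 5, state, {5: 0x4000})
```

After the call, PGIE holds the old GIE value, GIE is 0, CF is 1, the return address sits on the stack, and the PC holds the vector contents. Because a nested trap would overwrite PGIE and PCF, the status register has to be saved first if nesting is needed, as the description notes.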
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TSTB | Test Bit Fields 


Syntax TSTB src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 register (any register in CPU primary-register file)
01 direct
10 indirect
11 immediate

Operation dst AND src

Description The bitwise-logical AND of the dst and src operands is formed, but the
result is not loaded in any register. This allows for nondestructive compares.
The dst and src operands are assumed to be unsigned integers.

These condition flags are modified for all destination registers.

Status Bits LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero output is generated, 0 otherwise
V 0
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example TSTB *-AR4(1),R5

                     Before Instruction    After Instruction
AR4                  80 99C5h              80 99C5h
R5                   2200                  2200
Data at 80 99C4h     1895                  1895
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 1 0 0
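The flag behavior of TSTB can be checked with a short Python model. It is a sketch that covers only the flags the instruction writes (UF, N, Z, V); the function name and the returned dict are illustrative.

```python
def tstb(src, dst):
    """Flags produced by dst AND src; neither operand is modified."""
    out = dst & src & 0xFFFFFFFF
    return {"UF": 0, "N": (out >> 31) & 1, "Z": int(out == 0), "V": 0}
```

For the example values, 2200 (898h) and 1895 (767h) share no set bits, so the AND is zero and Z is set while R5 and memory stay unchanged.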
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TSTB3 Test Bit Fields, 3 Operands 


Syntax TSTB3 src2, src1

Operands src1, src2: type 1 or type 2 three-operand addressing modes

Opcode Type 1
31 24 23 16 15 8 7 0

Type 2
31 24 23 16 15 8 7 0
Fields include src1 and src2

Word Fields Type 1

T   src1 addressing modes                    src2 addressing modes
00  register mode (any CPU register)         register mode (any CPU register)
01  indirect mode (disp = 0, 1, IR0, IR1)    register mode (any CPU register)
10  register mode (any CPU register)         indirect mode (disp = 0, 1, IR0, IR1)
11  indirect mode (disp = 0, 1, IR0, IR1)    indirect mode (disp = 0, 1, IR0, IR1)

Type 2

T   src1 addressing modes                                 src2 addressing modes
00  register mode (any CPU register)                      8-bit signed immediate
01  register mode (any CPU register)                      indirect mode *+ARn(5-bit unsigned displacement)
10  indirect mode *+ARn(5-bit unsigned displacement)      8-bit signed immediate
11  indirect mode *+ARn1(5-bit unsigned displacement)     indirect mode *+ARn2(5-bit unsigned displacement)

Operation src1 AND src2

Description The bitwise-logical AND between the src1 and src2 operands is performed
but is not loaded into any register. This allows for nondestructive compares. The
src1 and src2 operands are assumed to be unsigned integers. The src2
immediate-addressing mode is sign-extended.

Although this instruction has only two operands, it is designated as a
three-operand instruction because operands are specified in the three-operand
format.

Status Bits LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero output is generated, 0 otherwise
V 0
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example None
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XOR_Bitwise Exclusive OR 


Syntax XOR src, dst

Operands src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode 31 24 23 16 15 8 7 0

Word Fields G src addressing modes
00 register (any register in CPU primary-register file)
01 direct
10 indirect
11 immediate

Operation dst XOR src → dst

Description The bitwise-exclusive OR of the src and dst operands is loaded into the dst
register. The dst and src operands are assumed to be unsigned integers.

If ST (SETCOND) = 0 and the destination register is R0-R11, the condition
flags are modified. If ST (SETCOND) = 1, they are modified for all destination
registers.

Status Bits LUF Unaffected
LV Unaffected
UF 0
N MSB of the output
Z 1 if a zero output is generated, 0 otherwise
V 0
C Unaffected

Mode Bit OVM operation is not affected by OVM bit value.

Cycles 1

Example XOR R1,R2

                     Before Instruction    After Instruction
R1
R2
LUF LV UF N Z V C    0 0 0 0 0 0 0         0 0 0 0 0 0 0
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XOR3 | Bitwise Exclusive OR, 3 Operands 


Syntax        XOR3 src2, src1, dst

Operands      src1, src2: type 1 or type 2 three-operand addressing modes
              dst: register mode (any register in CPU primary-register file)

Opcode        Type 1 and Type 2 (bit-field diagrams not recoverable; field
              boundaries at bits 31, 24/23, 16/15, and 8/7)

Word Fields
Type 1        T    src1 addressing modes              src2 addressing modes
              00   register mode                      register mode
                   (any CPU register)                 (any CPU register)
              01   indirect mode                      register mode
                   (disp = 0, 1, IR0, IR1)            (any CPU register)
              10   register mode                      indirect mode
                   (any CPU register)                 (disp = 0, 1, IR0, IR1)
              11   indirect mode                      indirect mode
                   (disp = 0, 1, IR0, IR1)            (disp = 0, 1, IR0, IR1)

Type 2        T    src1 addressing modes              src2 addressing modes
              00   register mode                      8-bit signed immediate
                   (any CPU register)
              01   register mode                      indirect mode *+ARn
                   (any CPU register)                 (5-bit unsigned displacement)
              10   indirect mode *+ARn                8-bit signed immediate
                   (5-bit unsigned displacement)
              11   indirect mode *+ARn1               indirect mode *+ARn2
                   (5-bit unsigned displacement)      (5-bit unsigned displacement)

Operation     src1 XOR src2 → dst

Description   The bitwise exclusive OR between the src1 and src2 operands is
              loaded into the dst register. The src1, src2, and dst operands
              are assumed to be unsigned integers. The src2
              immediate-addressing mode is sign-extended.

Status Bits   If ST(SET COND) = 0 and the destination register is R0-R11, the
              condition flags are modified. If ST(SET COND) = 1, they are
              modified for all destination registers.

              LUF  Unaffected
              LV   Unaffected
              UF   0
              N    MSB of the output
              Z    1 if a zero output is generated, 0 otherwise
              V    0
              C    Unaffected

Mode Bit      OVM  Operation is not affected by OVM bit value.

Cycles        1

Example       None
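
The description notes that the 8-bit immediate form of src2 is sign-extended before the XOR. The sketch below illustrates what that means for the result. It is an illustrative model only: the helper names `sign_extend_8` and `xor3_immediate` are hypothetical, and operands are treated as 32-bit values.

```python
MASK32 = 0xFFFFFFFF


def sign_extend_8(imm):
    """Sign-extend an 8-bit immediate to 32 bits (hypothetical helper)."""
    imm &= 0xFF
    # If bit 7 is set, fill bits 31-8 with ones; otherwise leave them zero.
    return imm | 0xFFFFFF00 if imm & 0x80 else imm


def xor3_immediate(src1, imm8):
    """Model src1 XOR sign_extend(imm8) -> dst for the type 2 immediate form."""
    return (src1 ^ sign_extend_8(imm8)) & MASK32


if __name__ == "__main__":
    # A negative immediate flips the upper 24 bits of src1 as well.
    print(f"{xor3_immediate(0x0000000F, 0xFF):08X}")
    # A non-negative immediate only affects the low byte.
    print(f"{xor3_immediate(0x0000000F, 0x7F):08X}")
```

The practical consequence is that an immediate with bit 7 set inverts bits 31-8 of the other operand, which is easy to overlook when the operands are thought of as unsigned.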
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XOR3||STI Parallel XOR3 and STI 


Syntax        XOR3 src2, src1, dst1
           || STI src3, dst2

Operands      src1: register (R0-R7)
              src2: indirect (disp = 0, 1, IR0, IR1)
              dst1: register (R0-R7)
              src3: register (R0-R7)
              dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode        (bit-field diagram not recoverable; field boundaries at
              bits 31, 24/23, 16/15, 8/7, and 0)

Word Fields   None

Operation     src1 XOR src2 → dst1
           || src3 → dst2

Description   A bitwise exclusive OR and an integer store are performed in
              parallel. All registers are read at the beginning and loaded at
              the end of the execute cycle. This means that if one of the
              parallel operations (STI) reads from a register and the
              operation being performed in parallel (XOR3) writes to the same
              register, then STI accepts as input the contents of the register
              before it is modified by the XOR3.

              If src2 and dst2 point to the same location, src2 is read before
              the write to dst2.

Status Bits   LUF  Unaffected
              LV   Unaffected
              UF   0
              N    MSB of the output
              Z    1 if a zero output is generated, 0 otherwise
              V    0
              C    Unaffected

Mode Bit      OVM  Operation is not affected by OVM bit value.

Cycles        1




Example       XOR3 *AR1++,R3,R3
           || STI R6,*-AR2(IR0)

                         Before Instruction    After Instruction
AR1                      80 987Eh              [not legible]
R3                       [not legible]         [not legible]
R6                       0DCh (220)            0DCh (220)
AR2                      80 98B4h              80 98B4h
IR0                      [not legible]         [not legible]
Data at 80 987Eh         [not legible]         85h
Data at 80 98ACh         0h                    0DCh (220)
LUF, LV, UF, N, Z, V, C  [not legible]         [not legible]
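
The read-before-write rule for the parallel XOR3 and STI can be modelled as a read phase followed by a write phase. The sketch below is a simplified illustration, not TI code: the function name `xor3_parallel_sti` is hypothetical, registers and memory are plain dictionaries, effective addresses are passed in already computed, and auxiliary-register updates such as `*AR1++` are left out. The R3 value used in the demo is an assumption, since the original value is not legible in the source.

```python
MASK32 = 0xFFFFFFFF


def xor3_parallel_sti(regs, mem, src1, src2_addr, dst1, src3, dst2_addr):
    """Model `XOR3 src2, src1, dst1 || STI src3, dst2`.

    All sources are sampled first, then both results are written, so the
    STI never sees a register or memory word already modified by the XOR3.
    """
    # Read phase: sample every operand before anything is written.
    xor_a = regs[src1]
    xor_b = mem[src2_addr]
    store_value = regs[src3]

    # Write phase: commit both results at the end of the cycle.
    new_regs = dict(regs)
    new_mem = dict(mem)
    new_regs[dst1] = (xor_a ^ xor_b) & MASK32
    new_mem[dst2_addr] = store_value & MASK32
    return new_regs, new_mem


if __name__ == "__main__":
    regs = {"R3": 0x0000000F, "R6": 0x000000DC}  # R3 value is assumed
    mem = {0x80987E: 0x00000085, 0x8098AC: 0x00000000}
    after_regs, after_mem = xor3_parallel_sti(
        regs, mem, "R3", 0x80987E, "R3", "R6", 0x8098AC
    )
    print(hex(after_regs["R3"]), hex(after_mem[0x8098AC]))
```

Passing the same register as both dst1 and src3 shows the first rule (the store receives the old register value), and passing the same address as src2 and dst2 shows the second (the XOR uses the old memory word).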
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14-278 


Appendix A


Glossary 


A0-A30: External address pins for data/program memory or I/O devices.
These pins are on the global bus. See also LA0-LA30.


address: The location of program code or data stored in memory. 


addressing mode: The method by which an instruction interprets its
operands to acquire the data it needs.


ALU: See Arithmetic logic unit. 


analog-to-digital (A/D) converter: A successive-approximation converter 
with internal sample-and-hold circuitry used to translate an analog signal 
to a digital signal. 


ARAU: See auxiliary-register arithmetic unit. 


arithmetic logic unit (ALU): The part of the CPU that performs arithmetic 
and logic operations. 


auxiliary registers (ARn): A set of registers used primarily in address
generation.


auxiliary-register arithmetic unit (ARAU): A 32-bit arithmetic logic unit
(ALU) used to calculate indirect addresses using the auxiliary registers
as inputs and outputs.


bit-reversed addressing: Addressing in which several bits of an address 
are reversed in order to speed processing of algorithms, such as Fourier 
transforms. 
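
As an illustration of the bit reversal that this addressing mode exploits, the sketch below computes the bit-reversed ordering of the indices of an 8-point transform. It shows the general permutation only and does not model how the ARAU generates the addresses in hardware; the function name `bit_reverse` is hypothetical.

```python
def bit_reverse(index, width):
    """Reverse the lowest `width` bits of `index` (hypothetical helper)."""
    result = 0
    for _ in range(width):
        # Shift the lowest remaining bit of index into the result.
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


if __name__ == "__main__":
    # Ordering in which an 8-point radix-2 FFT leaves (or expects) its data.
    order = [bit_reverse(i, 3) for i in range(8)]
    print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```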


BK: See block-size register. 




bootloader: An on-chip code that transfers code from an external memory 
or from a communication port to RAM at power-up. 


carry bit: A bit in status register ST used by the ALU for extended arithmetic
operations and accumulator shifts and rotates. The carry bit can be 
tested by conditional instructions. 


circular addressing: An addressing mode in which an auxiliary register is 
used to cycle through a range of addresses to create a circular buffer in 
memory. 
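
The wrap-around behaviour of a circular buffer can be sketched as an index update that stays inside the block. This is a simplified model under stated assumptions: the function name `circular_step` is hypothetical, the index is an offset from the start of the buffer, and the step is assumed to be no larger than the block size. The alignment rules that the hardware imposes on the buffer start address are not modelled.

```python
def circular_step(index, step, block_size):
    """Advance an offset inside a circular buffer of `block_size` words."""
    if not 0 <= index < block_size:
        raise ValueError("index must lie inside the buffer")
    if abs(step) > block_size:
        raise ValueError("step must not exceed the block size")
    candidate = index + step
    if candidate >= block_size:
        candidate -= block_size   # ran off the end: wrap to the start
    elif candidate < 0:
        candidate += block_size   # ran off the start: wrap to the end
    return candidate


if __name__ == "__main__":
    # Walk a 6-word buffer with a stride of 2, starting at offset 4.
    offset = 4
    for _ in range(4):
        print(offset)
        offset = circular_step(offset, 2, 6)
```

A compare-and-correct update like this avoids a general modulo operation, which is why circular buffers are convenient for FIR filter delay lines.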


context save/restore: A save/restore of system status (status registers,
accumulator, product register, temporary register, hardware stack, and
auxiliary registers, etc.) when the device enters/exits a subroutine such
as an interrupt service routine.


CPU: Central processing unit. The unit that coordinates the functions of a 
processor. 


CPU cycle: The time it takes the CPU to go through one logic phase (during
which internal values are changed) and one latch phase (during which 
the values are held constant). 


cycle: See CPU cycle. 


D0-D31: External data-bus pins that transfer data between the processor
and external data/program memory or I/O devices. See also LD0-LD31.


data-address generation logic: Logic circuitry that generates the addresses
for data-memory reads and writes. This circuitry can generate one address
per machine cycle. See also program address generation logic.


data-page pointer: A 32-bit register used as the 16 MSBs in addresses
generated using direct addressing.
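
A direct address can be pictured as the data-page pointer supplying the upper half and the instruction supplying the lower half. The sketch below is an assumption-laden illustration: the function name `direct_address` is hypothetical, and it assumes that the 16 LSBs of DP form the 16 MSBs of the address and that the remaining 16 bits come from the instruction's operand field, which this glossary entry does not spell out.

```python
def direct_address(dp, operand16):
    """Form a 32-bit direct address from DP and a 16-bit operand field.

    Assumes the 16 LSBs of DP become address bits 31-16 (an assumption
    made for this sketch).
    """
    return ((dp & 0xFFFF) << 16) | (operand16 & 0xFFFF)


if __name__ == "__main__":
    # With DP = 0080h, operand 987Eh selects address 0080 987Eh.
    print(f"{direct_address(0x0080, 0x987E):08X}")
```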


decode phase: The phase of the pipeline in which the instruction is decoded
(identified).


DIE: See DMA interrupt enable register. 


DMA coprocessor: A peripheral that transfers the contents of memory
locations independently of the processor (except for initialization).




DMA controller: See DMA coprocessor. 


DMA interrupt enable register (DIE): A register (in the CPU register file) 
that controls which interrupts the DMA coprocessor responds to. 


DP: See data-page pointer. 


dual-access RAM: Memory that can be accessed twice in a single clock 
cycle. For example, your code can read from and write to a dual-access 
RAM in one clock cycle. 


external interrupt: A hardware interrupt triggered by a pin. 


extended-precision floating-point format: A 40-bit representation of a 
floating-point number with a 32-bit mantissa and an 8-bit exponent. 


extended-precision register: A 40-bit register used primarily for extended- 
precision floating-point calculations. Floating-point operations use bits 
39-0 of an extended-precision register. Integer operations, however, use 
only bits 31-0. 


FIFO buffer: First-in, first-out buffer. A portion of memory in which data is
stored and then retrieved in the same order in which it was stored. Thus,
the first word stored in this buffer is retrieved first. The ’C4x’s
communication ports each have two FIFOs: one for transmit operations and
one for receive operations.


hardware interrupt: An interrupt triggered through physical connections 
with on-chip peripherals or external devices. 


hit: A condition in which, when the processor fetches an instruction, the 
instruction is available in the cache. 


IACK: Interrupt acknowledge signal. An output signal that indicates that an
interrupt has been received and that the program counter is fetching the
interrupt vector that will force the processor into an interrupt service
routine.
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Glossary 




IIE: See internal interrupt enable register.


IIF: See IIOF flag register.


IIOF flag register (IIF): Controls the function (general-purpose I/O or
interrupt) of the four external pins (IIOF0 to IIOF3). It also contains
timer/DMA interrupt flags.


index registers: Two registers (IR0 and IR1) that are used by the ARAU for
indexing an address. 


internal interrupt: A hardware interrupt caused by an on-chip peripheral. 


internal interrupt enable register: A register (in the CPU register file) that 
determines whether or not the CPU will respond to interrupts from the 
communication ports, the timers, and the DMA coprocessor. 


interrupt: A signal sent to the CPU that (when not masked) forces the CPU 
into a subroutine called an interrupt service routine. This signal can be 
triggered by an external device, an on-chip peripheral, or an instruction 
(TRAP, for example). 


interrupt acknowledge (IACK): A signal that indicates that an interrupt has
been received, and that the program counter is fetching the interrupt
vector location.


interrupt vector table (IVT): An ordered list of addresses which each
correspond to an interrupt; when an interrupt occurs and is enabled, the
processor executes a branch to the address stored in the corresponding
location in the interrupt vector table.


interrupt vector table pointer (IVTP): A register (in the CPU expansion 
register file) that contains the address of the beginning of the interrupt 
vector table. 


ISR: Interrupt service routine. A module of code that is executed in
response to a hardware or software interrupt. 


IVTP: See interrupt vector table pointer. 


LA0-LA30: External address pins for data/program memory or I/O devices.
These pins are on the local bus. See also A0-A30.


LD0-LD31: External data bus pins that transfer data between the processor
and external data/program memory or I/O devices. See also D0-D31.




LSB: Least significant bit. The lowest order bit in a word. 


machine cycle: See CPU cycle. 


mantissa: A component of a floating-point number consisting of a fraction 
and a sign bit. The mantissa represents a normalized fraction whose 
binary point is shifted by the exponent. 


maskable interrupt: A hardware interrupt that can be enabled or disabled 
through software. 


memory-mapped register: One of the on-chip registers mapped to addresses
in memory. Some memory-mapped registers are mapped to data memory, and
some are mapped to input/output memory.


MFLOPS: Millions of floating-point operations per second. A measure of
floating-point processor speed that counts the number of floating-point
operations made per second.


microcomputer mode: A mode in which the on-chip ROM (bootloader) is 
enabled. This mode is selected via the MP/MC pin. See also MP/MC pin; 
microprocessor mode. 


microprocessor mode: A mode in which the on-chip ROM is disabled. This
mode is selected via the MP/MC pin. See also MP/MC pin; microcomputer
mode.


MIPS: Million instructions per second.


miss: A condition in which, when the processor fetches an instruction, it is 
not available in the cache. 


MSB: Most significant bit. The highest order bit in a word. 


multiplier: A device that generates the product of two numbers. 


NMI: See Nonmaskable interrupt. 


nonmaskable interrupt (NMI): A hardware interrupt that uses the same 
logic as the maskable interrupts but cannot be masked. 
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overflow flag (OV) bit: A status bit that indicates whether or not an arithme- 
tic operation has exceeded the capacity of the corresponding register. 


PC: See program counter. 


peripheral bus: A bus that is used by the CPU to communicate with the DMA
coprocessor, communication ports, and timers.


pipeline: A method of executing instructions in an assembly-line fashion. 


program counter: A register that contains the address of the next
instruction to be fetched.


RC: See repeat counter register. 


read/write (R/W) pin: This memory-control signal indicates the direction of 
transfer when communicating to an external device. 


register file: A bank of registers. 


repeat counter register: A register (in the CPU register file) that specifies 
the number of times minus one that a block of code is to be repeated 
when a block repeat is performed. 


repeat mode: A zero-overhead method for repeating the execution of a 
block of code. 


reset: A means to bring the central processing unit (CPU) to a known state 
by setting the registers and control bits to predetermined values and 
signaling execution to fetch the reset vector. 


reset pin: This pin causes the device to reset. 


ROMEN: ROM enable. An external pin that determines whether or not the
on-chip ROM is enabled.


R/W: See read/write pin. 




short floating-point format: A 16-bit representation of a floating-point
number with a 12-bit mantissa and a 4-bit exponent.


short integer format: A twos-complement, 16-bit format for integer data.


short unsigned-integer format: A 16-bit unsigned format for integer data.


sign-extend: Fill the high order bits of a number with the sign bit.


single-precision floating-point format: A 32-bit representation of a
floating-point number with a 24-bit mantissa and an 8-bit exponent.


single-precision integer format: A twos-complement 32-bit format for
integer data.


single-precision unsigned-integer format: A 32-bit unsigned format for
integer data.


software interrupt: An interrupt caused by the execution of a TRAP
instruction.


split mode: A mode of operation of the DMA coprocessor. This mode allows
one DMA channel to service both the receive and transmit portions of a
communication port.


ST: See status register. 


stack: A block of memory reserved for storing and retrieving data on a
first-in last-out basis. It is usually used for storing return addresses
and for preserving register values.


status register: A register in the CPU register file that contains global
information related to the CPU.


Timer: A programmable peripheral that can generate pulses or time events. 


Timer-period register: A 32-bit memory-mapped register that specifies the
period for the on-chip timer.


trap vector table (TVT): An ordered list of addresses which each
correspond to an interrupt; when a trap is executed, the processor executes
a branch to the address stored in the corresponding location in the trap
vector table.


trap vector table pointer (TVTP): A register in the CPU expansion register 
file that contains the address of the beginning of the trap vector table. 
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TVTP: See trap vector table pointer. 


unified mode: A mode of operation of the DMA coprocessor. The mode is 
used mainly for memory-to-memory transfers. This is the default mode 
of operation for a DMA channel. See also split mode. 


wait state: A period of time that the CPU must wait for external program,
data, or I/O memory to respond when reading from or writing to that
external memory. The CPU waits one extra cycle for every wait state.


wait-state generator: A program that can be modified to generate a limited 
number of wait states for a given off-chip memory space (lower program, 
upper program, data, or I/O). 


zero fill: Fill the low or high order bits with zeros when loading a number into 
a larger field. 
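
Zero fill can happen at either end of the larger field. The sketch below shows both cases for a value loaded into a 32-bit field; the function names `zero_fill_high` and `zero_fill_low` are hypothetical and chosen only for this illustration.

```python
def zero_fill_high(value, from_bits):
    """Place a from_bits-wide value in the low bits; high bits become zero."""
    return value & ((1 << from_bits) - 1)


def zero_fill_low(value, from_bits, to_bits=32):
    """Place a from_bits-wide value in the high bits; low bits become zero."""
    return (value & ((1 << from_bits) - 1)) << (to_bits - from_bits)


if __name__ == "__main__":
    print(f"{zero_fill_high(0xABCD, 16):08X}")  # value sits in bits 15-0
    print(f"{zero_fill_low(0xABCD, 16):08X}")   # value sits in bits 31-16
```

Unlike sign extension, zero fill never copies the sign bit, so a 16-bit value with its top bit set stays positive when the high bits are filled.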


16-bit wide configured memory 
table 10-14 

32-bit wide configured memory 
table 10-15 


A/D converter 

definition A-1 
A0-A30 

definition A-1 
abbreviations 14-16 
ABSI||STI instruction 14-31 
ABSF instruction 14-26 
ABSF||STF instruction 14-27 
ABSI instruction 14-29 
ADDC instruction 14-33 
ADDC3 instruction 14-35 
ADDF instruction 14-37 
ADDF3 instruction 14-39 
ADDF3||STF instruction 14-41 
ADDI instruction 14-43 
ADDI3 instruction 14-45 
ADDI3||STI instruction 14-47 
addition 

floating-point 5-23 
address 

definition A-1 
address buses 

external 2-20 
address partitioning 

figure 4-10 
address pins 

external A-4 


Index 


address range 
LSTRB0 9-11 
STRB0 9-10 


address space 
caution 2-13 


addressing modes 

bit-reversed addressing 6-32 

circular 6-27 

conditional branch 2-18 

definition A-1 

encoding 6-21 
conditional branch 6-25 
general 6-21 
parallel 6-24 
three-operand 6-22 

general 2-18 

groups 6-21 

parallel 2-18 

three operand 2-18, 6-22 


addressing types 6-2 
direct addressing 6-5 
immediate 6-18 
indirect addressing 6-6 to 6-21 
PC relative 6-19 
register 6-3 
AE bit 9-7 
aliasing 2-17 
ALU. See arithmetic logic unit (ALU) 
analysis bit 3-7 
analysis module 
registers 4-6 
AND instruction 14-49 
AND3||STI instruction 14-53 
AND3 instruction 14-51 
ANDN instruction 14-55 
ANDN3 instruction 14-57 
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application(s) 
automotive viii, xiv 
consumer viii, xiv 
control viii, xii 
development support viii, xv 
general-purpose viii 
graphics/imagery viii, xi 
medical viii, xiv 
military viii, xiii 
multimedia viii, xiii 
speech/voice viii, xi 
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ARAUs. See auxiliary register arithmetic units 
(ARAUs) 


architectural overview 
introduction 2-1 


architecture 
peripheral bus 2-22 
arithmetic logic unit (ALU) 2-4 
definition A-1 
ASH instruction 14-59 
ASH3 instruction 14-61 
ASH3||STI instruction 14-63 


assembly language instructions 14-2 to 14-11 
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flags 14-12 
example instruction 14-23 
illegal instructions 14-11 
interlocked operation 14-8 
load and store 14-2 
parallel operation 14-9 
program control 14-7 
register symbols 14-21 to 14-22 
symbols 14-17 to 14-22 
syntax options 14-18 to 14-22 
three-operand 14-6 
two-operand 14-4 
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consecutive 11-40 
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synchronization 11-37 
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indirect 6-9 
auxiliary register arithmetic units (ARAUs) 2-6 
auxiliary registers (AR0-AR7) 2-6, 3-4 
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auxiliary registers (ARn) 
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auxiliary transfer-counter register 11-7 
auxiliary-register arithmetic unit (ARAU) 
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BcondAF instruction 7-11, 8-7, 14-67 
example 8-7 
BcondAT instruction 7-11, 8-7, 14-69 
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bit-reversed addressing 6-32 
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example 6-32 
index steps 6-33 
block diagram 
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peripheral modules 2-22 
timers 13-3 
block repeat 
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block transfer completion 11-6 
block transfer sequence 11-5 
bootloader 
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description 10-2 
from communication port 10-3 
from memory 10-3 
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operation 10-5 
setting the IIOF pins 10-19 
source code 10-20 to 10-25 
source structure 10-8 
bootloader mode selection 
table 10-3 
bootloading 
from a comm port 10-16 
from memory 10-10 
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BR instruction 14-73 
branch conflicts 8-4 


branch execution 
delayed 7-10 
branches 7-9, 7-12 
BRD instruction 14-74 
bus operation 
external 2-20, 9-1 to 9-50 
internal 2-19 
busy-waiting example 9-42 
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table 10-11 to 10-14 
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’C40/'C44 features 
table 1-4 
’C44 memory aliasing 
figure 2-17 
’C44 memory map 4-4 
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’C4x multiprocessor system 
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example 5-18 
’C4x-specific instructions 14-3 to 14-8 
cache 4-1 
cache clear (CC) bit 4-12 
cache enable (CE) bit 4-12 
cache freeze (CF) bit 4-12 
cache memory 2-11, 4-13 
architecture 2-11, 4-10 
control bits 4-12 
enabling 4-13 
hit 4-14 
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LRU algorithm 4-14 
miss 4-14 
segment miss 4-14 
subsegment miss 4-14 
CALL instruction 7-12, 14-75 
CALL response timing 
figure 7-14 
CALLcond instruction 7-12, 14-76 
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carry bit 
definition A-2 
CC bit 3-6, 4-12 
CE and CF bits 
combined effect 
table 4-13 
table 3-7 
CE bit 3-6, 4-12 
CE0 bit 9-7 
CE1 bit 9-7 
central processing unit. See CPU 
CF bit 3-6, 4-12 
channel control register. See DMA channel control 
register 
channel priority scheme 
split mode 11-25 
circular addressing 
definition A-2 
example 6-30 
FIR filters 6-31 
register relationships 
figure 6-28 
circular addressing mode 6-27 
circular buffer 
implementation 6-29 
CLKSRC = 0 and FUNC = 0 13-14 
CLKSRC = 0 and FUNC = 1 13-14 
CLKSRC = 1 and FUNC = 0 13-13 
CMPF instruction 14-78 
CMPF3 instruction 14-80 
CMPI instruction 14-82 
CMPI3 instruction 14-84 
communication port load mode 
flow chart 10-7 
communication port memory map 
figure 12-7 
communication port reset 
example 12-10 
communication port software register 12-3 
communication ports 
arbitration unit 12-3, 12-11 
block diagram 12-4 
control register 12-3 
coordination with CPU/DMA 12-17 
CSTRB width restrictions 12-25 
features 2-23, 12-2 
H1/H3 synchronization 12-26 
input FIFO halt 12-15 
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communication ports (continued) 
input port post-reset state 12-31 
input port register 12-9 
interconnection 12-5 
introduction 12-1 
memory map 4-8, 12-7 
figure 4-8 
output FIFO halt 12-15 
output port post-reset state 12-30 
output port register 12-9 
reset 12-29 
tips 12-32 
token transfer 12-19 
communication-port control register (CPCR) 12-8 
field descriptions 12-8 
figure 12-8 
ICH 12-8 
INPUT LEVEL 12-9 
OCH 12-8 
OUTPUT LEVEL 12-9 
PORT DIR 12-8 
communication-port software 
reset address 
table 12-10 
reset register 12-10 
condition codes 
flags 14-14 
conditional-branch addressing modes 2-18, 6-25 
encoding 6-26 
consecutive autoinitializations 11-40 
consumer applications viii, xiv 
context save/restore 


definition A-2 
control applications viii, xii 
control bits 


repeat mode 7-3 
control registers 7-35, 11-7, 13-5 
conversion of format 
’C4x floating-point to integer 5-31 
extended-precision floating-point to single-preci- 
sion floating-point 5-12 
FRIEEE instruction 14-98 
IEEE single precision std. 754 5-13 
IEEE to ’C4x floating-point 5-14 
integer to floating-point 5-33 
short floating-point to extended-precision floating- 
point 5-11 
short floating-point to single-precision floating- 
point 5-11 
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single-precision ’C4x floating-point 5-13 

single-precision floating-point to extended-preci- 
sion floating-point 5-12 

TOIEEE instruction 14-265 


converting IEEE format 
table 5-14 


converting twos complement 
table 5-17 


counter register 13-5 


CPU 2-4 
arbitration 11-27 
block diagram 2-5 
buses 2-19 
components 2-4 
communication ports coordination 12-17 
definition A-2 
internal interrupt enable register (IIE) 2-9, 3-11 
primary register file 2-6 
CPU cycle 
definition A-2 
CPU expansion register file 
definition 3-1 
CPU primary register file 3-2 
definition 3-1 
CPU registers 2-7, 3-8, 7-36 
auxiliary (AR0-AR7) 2-6, 3-4 
block repeat (RC, RE, RS) 3-16 
block size (BK) 2-8, 3-5 
data page pointer (DP) 2-8, 3-4, 6-5 
DMA interrupt enable (DIE) 2-9, 3-8, 11-44 
expansion register file 2-10, 3-17 
extended precision (R0-R11) 2-6, 3-3 
IIE 3-11 
IIOF flag register (IIF) 2-9, 3-13 
index (IR0, IR1) 2-8, 3-4 
internal interrupt enable (IIE) 2-9, 3-11, 3-12 
introduction 3-1 
program counter (PC) 2-9, 2-19, 3-16 
repeat count (RC) 2-9, 3-16, 7-2 
repeat end address (RE) 3-16, 7-2 
See also repeat block (RC, RE, RS) 
repeat start address (RS) 3-16, 7-2 
See also repeat block (RC, RE, RS) 
stack pointer (SP) 2-8, 3-5 
status register (ST) 2-9, 3-5, 14-13 
table 3-2, 6-3 
timer 4-7 


CSTRB width restrictions 12-25 


D0-D31 
definition A-2 
data buses 
external 2-20 
data formats 
introduction 5-1 
data page pointer (DP) 2-8, 3-4, 6-5 
data structure 
FIR filters 6-31 
data transfer modes 11-28 
data transfer operation 12-6 
data-address generation logic 
definition A-2 
data-page pointer 
definition A-2 
DBcond instruction 14-86 
DBcondD instruction 14-88 
DBR instruction 8-9 
DE bit 9-7 
decode phase 
definition A-2 
delayed branches 7-9 
conditional 7-9 
disabled interrupts 7-9 
example 7-10 
incorrectly placed 7-10 
example 7-7 
with annul option 8-7 
with annulling 7-11 
without annul option 8-6 
example 8-6 
without annulling 7-10 
destination address register 11-7 
destination address-index register 11-7 
development support applications viii, xv 
DIE register bit functions 
DMA split mode 11-45 
direct addressing 6-5 
example 6-5 
figure 6-5 
direct memory access. See DMA coprocessor 
displacement 
indirect addressing 
table 6-7 
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displacements 6-6 to 6-21 


DMA channel control register 7-35, 11-7 

bit definitions 11-8 

data transfer modes 11-28 

field descriptions 11-8 

modifiable by autoinitialization in split 
mode 11-40 

modifiable by autoinitialization in unified 
mode 11-39 

modifiable by autoinitialization of auxiliary chan- 
nel 11-40 


DMA channel control register bit definitions 
AUTOINIT STATIC 11-8 
AUTOINIT SYNC 11-9 
AUX AUTOINIT STATIC 11-9 
AUX AUTOINIT SYNC 11-9 
AUX START 11-10 
AUX STATUS 11-11 
AUX TCC 11-10 
AUX TCINT FLAG 11-10 
AUX TRANSFER MODE 11-8 
COM PORT 11-9 
DMA PRI 11-8 
PRIORITY MODE 11-11 
READ BIT REV 11-9 
SPLIT MODE 11-9 
START 11-10 
STATUS 11-11 
SYNC MODE 11-8 
TCC 11-10 
TCINT FLAG 11-10 
TRANSFER MODE 11-8 
WRITE BIT REV 11-9 


DMA channel registers 
(SPLIT MODE=1, auxiliary transfer counter = 0) 
figure 11-36 
storage in memory (SPLIT MODE=0) 
figure 11-35 
storage in memory (SPLIT MODE=1) 
figure 11-36 


DMA channel running 

transfer mode 10 (binary) 
figure 11-29, 11-30 

transfer mode 11 (binary) 
figure 11-31, 11-33 


DMA control register bits 
effect 11-38 


DMA controller 2-19 
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DMA coprocessor 2-23 

address generation 
figure 11-16 
address registers 11-15 
arbitration 11-27 
autoinitialization 11-34 
auxiliary channel 11-20 
block transfer sequence 11-5 
buses 2-19 
channel arbitration 11-24 
channel configuration 
figure 11-19 

channel register map 4-9, 11-4 
channel synchronization 11-43 to 11-46 
communication ports coordination 12-17 
definition A-2 
destination synchronization 11-48 
features 11-2 
functional description 11-3 
index registers 11-15 
interrupts 11-42 
introduction 11-2 
link-pointer register 11-17 
memory map 11-4 
operational modes 11-3 
primary channel 11-20 
priorities 11-22 
priority wheel 11-24 
registers 11-5, 11-7 
six channels 2-23 
source and destination synchronization 11-49 
source synchronization 11-48 
split mode 3-10, 11-20 
timing 11-51 
transfer count register 11-16 
transfer modes 11-28 
unified mode 3-8, 11-19 

DMA coprocessor memory map 
figure 4-9 

DMA destination synchronization 
figure 11-49 

DMA interrupt enable register (DIE) 2-9, 3-8, 11-44 
bit functions 3-10 
definition A-3 

DMA interrupts 7-26 
control bits 7-26 
CPU interaction 7-28 
processing 7-27 

DMA memory transfer timing 
single 11-51 
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DMA PRI and CPU/DMA arbitration rules 11-27 
table 11-12 


DMA registers initialization 11-5 


DMA source and destination sync 
unified mode 11-50 


DMA source synchronization 
figure 11-48 


DMA start 11-5 


DMA timing for different synchronizations 
split mode 
figure 11-56 
unified mode 
figure 11-55 


DMA transfers 
timing and number of cycles to global bus 
figure 11-54 
timing and number of cycles to local bus 
figure 11-53 
timing and number of cycles to on-chip 
figure 11-52 


DMAINTx flag 3-15 
documentation vii 


dual-access RAM 
definition A-3 


edge-triggered interrupts 7-15 
EDMAINTx bit 3-12 
EICFULLx bit 3-12 
EICRDYx bit 3-12 
EIIOFx bit 3-14 
EOCEMPTYx bit 3-12 
EOCRDYx bit 3-12 
ETINT0 bit 3-12 
ETINT1 bit 3-12 
execute only 8-10, 8-13 
parallel store 8-14 
single store 8-13 
expansion register file 2-10, 3-17 
interrupt vector table (IVT) 7-16 
exponent field 
definition 5-8 
extended precision registers 3-3 
extended-precision floating-point format 
definition A-3 


extended-precision register 
definition A-3 
floating-point format 3-4 
integer format 3-4 


external bus 
control registers. See memory interface control 
registers 
interface signals 9-3 


external bus operation 9-1 to 9-50 
overview 9-2 


external buses (global, local), wait states 9-14 


external interrupts 2-21, 7-21 
definition A-3 
external memory interface registers 7-35 


features comparison 1-4 


FIFO buffer 
definition A-3 
FIFOs 
halting 12-14 
FIR filters 


circular addressing 6-31 
data structure 6-31 


FIX instruction 5-31, 14-90 
FIX||STI instruction 14-92 
fixed priority 11-22 

FLAGx bit 3-14 

FLOAT instruction 5-33, 14-94 
FLOAT||STF instruction 14-96 


floating point 
addition 5-23 
conversion to integer 5-31 
extended-precision format 5-7 
format conversion 5-11 
formats 5-4 
normalization 5-23, 5-27 
reciprocal 5-34 
register format 3-4 
rounding value 5-29 
single-precision format 5-6 
floating point (continued) 
subtraction 5-23 
underflow 5-24 
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floating-point 
determining decimal equivalent 5-8 
extended-precision format 5-7 
general format 5-4 
multiplication 5-19 
short format 5-5 
figure 5-5 
single-precision format 
figure 5-6 
floating-point addition 
32-bit shift 5-26 
example 5-25 
floating-point addition/subtraction 
example 5-26 
floating-point formats 
IEEE Std. 754 5-13 
supported types 5-4 
floating-point multiplication 
chart 5-20 
floating-point multiply 
mantissa=1.0 5-22 
mantissa=1.5 5-21 
mantissa = 2.0 5-21 
positive and negative numbers 5-22 
floating-point operation 
introduction 5-1 
floating-point rounding 
flowchart 5-30 
floating-point subtraction 
example 5-25 
floating-point to integer conversion 
flowchart 5-32 
floating-point values 
fractional 5-10 
negative 5-10 
positive 5-9 
floating-point/integer multiplier 2-4 
format 
conversion 
’C4x to IEEE 5-17 
conversions 
IEEE std. 754 5-13 
formats 
conversion 
floating-point 5-11 
See also conversion of formats 
formats (continued) 
signed integer 5-2 
unsigned integer 5-3 
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FRIEEE instruction 14-98 
FRIEEE||STF instruction 14-99 
FUNCx bit 3-14 


general addressing modes 2-18, 6-21 
encoding 6-22 
general-purpose applications viii 
GIE bit 3-7 
global and local memory 
interface control signals 9-3 
global memory 9-39, 9-41, 9-43 
interface 2-20, 9-2 
table 9-4 
global memory port status 
STRBO and STRB1 accesses 9-5 
graphics/imagery applications viii, xi 


halting of FIFOs 12-14 
hardware interrupt 
definition A-3 
hit 
cache 4-14 
definition A-3 
hold everything 8-10, 8-15 
busy external port 8-15 
conditional calls and traps 8-16 
multicycle data reads 8-16 


IACK 

definition A-3 
IACK instruction 9-49, 14-100 
IACK pin 9-49 

timing 9-49 
ICFULL interrupt 

description 12-17 

enabling 3-12 
ICRDY flag 

interrupt use 11-46 
ICRDY interrupt 

description 12-17 
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enabling 3-12 
interrupt use 3-9, 3-10, 11-44, 11-45 
IDLE instruction 14-102, 14-103 
IEEE std. 754 (conversions) 5-13 
IEEE to ’C4x conversion 
example 5-16 
IIE register 3-11 
IIF register 7-17 
IIF register modification 3-13 
figure 7-18 
IIOF flag register (IIF) 2-9, 3-13 
definition A-4 
IIOF pins 
boot loader use 10-5 
modification 10-19 
immediate addressing 6-18 
example 6-18 
index registers (IR0, IR1) 2-8, 3-4, 11-15 
definition A-4 
indirect addressing 6-6 
displacement 6-7 
flexibility 6-6 
index register IR1 6-8 
index register IR0 6-7 
operand coding 
figure 6-6 
postdisplacement add and circular modify 6-12 
postdisplacement add and modify 6-11 
postdisplacement subtract and circular 
modify 6-13 
postdisplacement subtract and modify 6-12 
postindex add and bit-reversed modify 6-17 
postindex add and circular modify 6-16 
postindex add and modify 6-15 
postindex subtract and circular modify 6-17 
postindex subtract and modify 6-16 
predisplacement add 6-9 
predisplacement add and modify 6-10 
predisplacement subtract 6-10 
predisplacement subtract and modify 6-11 
preindex add 6-13 
preindex add and modify 6-14 
preindex subtract 6-14 
preindex subtract and modify 6-15 
special cases 6-8 
individual instruction descriptions 14-20 
input and output FIFO halting 
summary 12-14 
input FIFO channel 12-3 


instruction cache 4-10 
architecture 4-10 
figure 4-11 
reset 4-12 
instruction register (IR) 2-19 


instruction set summary 14-2 to 14-11 
functional groups 14-2 


instructions 
See also assembly language 
interlocked 9-44 

integer 
short format 5-2 
short unsigned format 5-3 
signed formats 5-2 
single-precision unsigned format 5-3 
single-precision format 5-2 
unsigned formats 5-3 


integer formats 

short integer 5-2 

signed 5-2 
interlocked instructions 2-20, 9-39, 9-44 
interlocked operations 9-39 


interlocked operations instructions 
table 14-8 


internal buses 2-4, 2-19 

internal interrupt 
definition A-4 

internal interrupt enable register (IIE) 2-9, 3-11 
definition A-4 

internal interrupts 7-18 

interrupt 
definition A-4 

interrupt acknowledge (IACK) 
definition A-4 

interrupt acknowledge (IACK) instruction 7-20 

interrupt flag register (IIF) 
figure 3-14 

interrupt latency 
table 7-21 

interrupt service routine (ISR) 
definition A-4 

interrupt vector table (IVT) 7-15 
boot loader use 10-8 
definition A-4 

interrupt vector table pointer (IVTP) 
definition A-4 
interrupts 2-21 
control bits 7-17 
DMA 7-28, 11-42 
DMA interaction 7-28 
edge triggered 11-42 
edge-triggered 11-42 
external 2-21, 7-21 
initialization 7-24 
initiation condition 7-15 
latency 7-20 
level triggered 11-42 
level-triggered 11-42 
NMI 7-22 
overlapping the IVT and TVT 7-25 
prioritization 7-15 
processing 7-18, 7-19, 7-27, 7-28 
timer 13-12 
vector table 7-16 
vectors 7-28, 13-11 


ISR. See interrupt service routine (ISR) 
IVTP. See interrupt vector table (IVT) 
IVTP register 2-10, 3-17 


jumps 7-12 


LA0-LA30 
definition A-4 
LAJ instruction 7-13, 14-105 
LAJcond instruction 7-13, 14-106 
LATcond instruction 7-13, 7-25, 14-107 
LBb instruction 14-108 
LBUb instruction 14-110 
LD0-LD31 
definition A-4 
LDA instruction 14-111 
LDE instruction 14-112 
LDEP instruction 14-114 
LDF instruction 14-115 
LDF||LDF instruction 14-121 
LDF||STF instruction 14-123 
LDFcond instruction 14-117 
LDFI instruction 9-39, 9-45, 14-119 
LDHI instruction 14-125 
LDI instruction 14-126 
LDI||LDI instruction 14-132 
LDI||STI instruction 14-134 
LDIcond instruction 14-128 
LDII instruction 9-39, 9-45, 14-130 
LDM instruction 14-136 

LDP instruction 14-138 
LDPE instruction 14-139 
LDPK instruction 14-140 
level-triggered interrupts 7-15 
LHUw instruction 14-143 
LHw instruction 14-141 


link pointer 
incrementing 11-36 
reference to 11-41 
self referential 11-41 


link pointer registers 11-7 
figure 11-18 


literature vii 
LLOCK signal 9-44 


load and store instructions 
table 14-3 


loading sequence 
bootloader 10-10 


local memory interface 2-20, 9-2 
LOCK signal 9-44 


low IIOF signal 
circuit diagram 10-19 


LRU algorithm 4-14 
LRU stack 4-12 
LSB 
definition A-5 
LSH instruction 14-145 
LSH3 instruction 14-147 
LSH3||STI instruction 14-149 
LUF flag 3-5 
LV flag 3-5 
LWLct instruction 14-152 
LWRct instruction 14-154 


machine values 14-21 


mantissa 
definition 5-8, A-5 
mapping addresses to strobes 9-12 
maskable interrupt 
definition A-5 
MBct instruction 14-156 
medical applications viii, xiv 
memory 2-11, 8-10 


See also memory interface 
accesses 
pipeline 8-19 
timing 8-19 
addressing modes 2-18 
aliasing 2-17 
block diagram 2-12 
cache 4-10, 4-13 
communication ports memory map 12-7 
control registers. See memory interface 
global 9-39, 9-41, 9-43 
introduction 4-1 
maps 2-13, 4-2 
maximizing pipeline performance 8-17 
memory maps 
communication ports 4-8 
DMA 11-4 
timer registers 4-7, 13-5 
organization 2-11, 4-2 
parallel multiplies and adds 8-23 
parallel stores 8-21 
pipeline conflicts 8-10 
ranges 9-10 
registers. See memory interface control registers 
ROMEN pin 4-2 
sharing 9-42 
signal-group control 9-38 
space 2-11 
three-operand accesses 8-20 
timing 8-19, 9-16 
two-operand accesses 8-20 


memory accesses 


data access 8-17 

data loads and stores 8-20 
external program fetches 8-19 
internal clock 8-19 

internal program fetches 8-19 
program fetch 8-17 

two data accesses 8-18 


memory cache 


rules for efficient usage 4-13 


memory conflicts 8-4 


memory interface 
address ranges 9-11 
control registers 9-6 
control signals 9-3 
page size 9-9 
PAGESIZE field. See memory interface control 
registers 


memory interface (local, global) 
features 9-2 
ready generation 9-14, 9-16 
timing 9-16 
wait states 9-14 
memory interface control registers 4-6 
address ranges 9-10 
bit contents 9-7 
fields 
figure 9-7 
figure 4-6 
reset effect 9-6 
STRBx SWW field 9-15 
timing 9-16 
wait states 9-14 
memory load 
flow chart 10-6 
memory map 4-2 
analysis module registers 4-6 
C44 2-15 
communication ports 4-8, 12-7 
DMA coprocessor 
figure 4-9 
DMA 4-9, 11-4 
global memory bus 9-12 
peripheral 2-16 
timer registers 4-7, 13-5 
memory-mapped register 
definition A-5 
MFLOPS 
definition A-5 
MHct instruction 14-158 
microcomputer mode 
definition A-5 
microprocessor mode 
definition A-5 
military applications viii, xiii 
MIPS 
definition A-5 
miss 
cache 4-14 
definition A-5 
mode selection 
bootloader 10-3 
mode selection flow 
figure 10-4 
module reset 12-29 
MPYF instruction 14-160 
MPYF3 instruction 14-162 
MPYF3||ADDF3 instruction 14-163 
MPYF3||STF instruction 14-165 
MPYF3||SUBF3 instruction 14-167 
MPYI instruction 14-169 
MPYI3 instruction 14-171 
MPYI3||ADDI3 instruction 14-173 
MPYI3||STI instruction 14-175 
MPYI3||SUBI3 instruction 14-177 
MPYSHI instruction 14-179 
MPYSHI3 instruction 14-180 
MPYUHI instruction 14-182 
MPYUHI3 instruction 14-183 
MSB 
definition A-5 
multimedia applications viii, xiii 
multiplication 
floating-point 5-19 
multiplier 
definition A-5 

multiply or CPU operation 
parallel store 8-21 


N flag 3-5 
NEGB instruction 14-185 
NEGF instruction 14-187 
NEGF||STF instruction 14-189 
NEGI instruction 14-191 
NEGI||STI instruction 14-193 
Newton-Raphson algorithm 
example 5-35 
reciprocal square root 5-38 
NMI 7-22 
NMI bus grant field 3-7 
NMI flag 3-15 
no DMA synchronization 
figure 11-47 
nonmaskable interrupt (NMI) 
definition A-5 
NOP instruction 14-195 
NORM instruction 5-27, 14-196 
execution 5-28 
flowchart 5-27 
normalization 
floating point value 5-23, 5-27 
NOT instruction 14-198 
NOT||STI instruction 14-200 


object values 
three-operand instructions 6-23 
OCEMPTY interrupt 
description 12-17 
enabling 3-12 
OCRDY flag 
interrupt use 11-46 
OCRDY interrupt 
description 12-17 
enabling 3-12 
interrupt use 3-9, 3-10, 11-44, 11-45 
operational overview 
communication ports 12-3 
OR instruction 14-202 
OR3 instruction 14-204 
OR3||STI instruction 14-206 
output FIFO channel 12-3 
output value formats 14-12 
overflow 5-24, 5-31 
overflow flag (OV) bit 
definition A-6 
OVM flag 3-6 


P flag (cache) 4-10 

page size 9-9 

page size operation 9-13 

parallel addressing modes 2-18, 6-24 


parallel instructions 
table 14-9 


parallel multiplies and adds 
figure 8-23 
parallel multiply with ADD/SUB 
encoding 6-24 
PAU. See port arbitration unit 
PAU state definitions 12-11 
PCF bit 3-6 
PC-relative addressing 
encoding 6-20 
example 6-19 
period register 13-5 
peripheral 
memory map 4-5 
peripheral modules 
figure 2-22 
peripheral bus 2-22 
definition A-6 
general architecture 2-22 
memory map 4-5 
peripheral memory map 2-16 
peripherals 
communication port 2-23 
PGIE bit 3-7 
pin states 
table 7-29 to 7-35 
pipeline 
conflicts 8-4 
branch 8-4 
memory 8-10 
register 8-8 
resolving (memory) 8-17 
decode unit 8-2 
definition A-6 
execute unit 8-2 
fetch unit 8-2 
four major units 8-2 
introduction 8-1 
memory accesses 8-19 
read unit 8-2 
structure 8-2 
pipeline structure 
figure 8-3 
POP instruction 14-208 
POPF instruction 14-209 
port arbitration unit 12-3, 12-11 
previous cache freeze (PCF) bit 4-13 
primary register file (CPU) 2-6, 3-2 
prioritization 7-15 


priority wheel (DMA) 11-24 
program 
buses 2-19 


program control instructions 
table 14-7 
program counter (PC) 2-9, 2-19, 3-16 
definition A-6 
program fetch 
multicycle program memory fetches 8-12 
program fetch incomplete 8-10, 8-12 


program wait 8-10 
due to multicycle access 8-12 
wait until CPU data access completes 8-11 


PUSH instruction 14-210 
PUSHF instruction 14-211 


RAM 2-11 
RC register 7-4 
RCPF instruction 5-34, 5-35, 14-212 
RCPF instruction algorithm 
figure 5-34 
read of AR 8-9 
read/write (R/W) pin 
definition A-6 
ready 
generation 9-14 
timing 9-16 
reciprocal (RCPF instruction) 5-34 
reciprocal algorithm 5-35 
reciprocal square root (RSQRF instruction) 5-36 
register addressing 6-3 
register bit functions 
DMA unified mode 
figure 3-8 
register buses 2-19 
register conflicts 8-4 
register file 
definition A-6 
registers 2-6, 2-7 
auxiliary (AR0-AR7) 2-6, 3-4 
block repeat (RC, RE, RS) 3-16 
block size (BK) 2-8, 3-5 
data page pointer (DP) 2-8, 3-4, 6-5 
DMA interrupt enable (DIE) 2-9, 3-8 
expansion register file 2-10 
extended precision 3-3 
extended precision (R0-R11) 2-6 
IIOF flag register (IIF) 2-9, 3-13, 7-17, 7-26 
index (IR1, IR0) 3-4 
input port 12-9 
internal interrupt enable (IIE) 2-9, 3-11, 3-12 
output port 12-9 
pipeline conflicts 8-8 
program counter (PC) 2-9, 2-19, 3-16 
repeat count (RC) 2-9, 3-16, 7-2 
repeat end address (RE) 3-16, 7-2 
See also repeat block (RC, RE, RS) 
repeat mode 7-2 
repeat start address (RS) 3-16, 7-2 
See also repeat block (RC, RE, RS) 
stack pointer (SP) 2-8, 3-5 
status register (ST) 2-9, 3-5, 14-13 
timer counter 13-8 
timer global control 13-6 


repeat count register (RC) 2-9, 3-16, 7-2 
definition A-6 


repeat end address register (RE) 7-2 


repeat mode 7-2 
block (RPTB) 7-2 
block delayed (RPTBD) 7-2 
control bits 7-3 
definition A-6 
nesting 7-8 
operation 7-3 
RC value after completion 7-7 
restriction rules 7-6 
RPTB instruction 7-4 
RPTBD instruction 7-4 
RPTS instruction 7-5 
single instruction (RPTS) 7-2 


repeat mode flag 3-6 
repeat mode registers 7-2 
repeat start address register (RS) 7-2 


repeat-mode control algorithm 
example 7-4 


reserved bits 3-16 


reset 7-29 
additional operations 7-35 
communication ports 12-29 
definition A-6 
memory interface control registers 9-6 
pin states 7-29 
vector location 7-35 
vectors 7-28 


RESET pin 12-29 
reset pin 
definition A-6 
RESETLOC pins 10-10 
RETIcond instruction 7-13, 7-24, 14-213 
RETIcondD instruction 7-13, 7-24, 14-214 
RETScond instruction 7-12, 14-215 
returns 7-12 
RM bit 7-3 
RM flag 3-6 
RND instruction 5-29, 14-216 
ROL instruction 14-218 
ROLC instruction 14-219 
ROM 2-11, 2-13 
ROMEN 2-13 
definition A-6 
ROMEN pin 2-11, 4-2 
ROR instruction 14-221 
RORC instruction 14-222 
rotating priority 11-22 
rotating priority DMA 
read and write sequence 11-23 


rotating priority mode 
figure 11-23 
rounding of floating-point value 5-29 
RPTB instruction 7-2, 7-8, 8-5, 14-223 
pipeline conflict 7-7 


RPTB operation 
example 7-4 
RPTBD instruction 7-2, 7-8, 14-224 
RPTS execution 
steps 7-6 
RPTS instruction 7-2, 8-5, 14-226 


RSQRF instruction 5-36, 14-228 
algorithm 
figure 5-37 


S bit 7-3 


semaphores 9-43 
service sequence 
split mode priority 11-26 
SET COND bit 3-7 
short floating-point format 
definition A-7 
short integer format 
definition A-7 
short unsigned-integer format 
definition A-7 
SIGI instruction 9-39, 9-47, 14-229 
signed-integer formats 5-2 
sign-extend 
definition A-7 
single-precision floating-point format 
definition A-7 
single-precision integer format 
definition A-7 
single-precision unsigned-integer format 
definition A-7 
software interrupt 
definition A-7 
source address register 11-7 
source address-index register 11-7 
source and destination synchronization 11-49 
source synchronization 11-48 
speech/voice applications viii, xi 
split mode 3-10 
definition A-7 
split mode (DMA) 11-20, 11-35 
split-mode DMA configuration 
figure 11-21 
split-mode synchronization interrupts 
table 3-11 
stack 
definition A-7 
stack pointer (SP) 2-8, 3-5 
standard (nondelayed) branches 8-4 
standard branch 7-9 
example 8-5 
START field descriptions 
table 11-14 
state diagram 
port arbitration unit 12-12 
STATUS field descriptions 
table 11-14 
status register (ST) 2-9, 3-5, 14-13 
definition A-7 
figure 3-5 
STF instruction 14-230 
STF||STF instruction 14-234 
STFI instruction 9-39, 9-46, 14-232 
STI instruction 14-235 
STI||STI instruction 14-237 
STII instruction 9-39, 9-46, 14-236 
STIK instruction 14-239 
STRB ACTIVE 9-8 
STRB SWITCH 9-8 
STRB0 PAGESIZE 9-8 
STRB0 SWW 9-8 
STRB0 WTCNT 9-8 
STRB1 PAGESIZE 9-8 
STRB1 SWW 9-8 
STRB1 WTCNT 9-8 
STRBx PAGESIZE fields 
figure 9-13 
strobe settings 9-7 
strobes 9-12 
timing 9-16 
style (manual) iv 
SUBB instruction 14-240 
SUBB3 instruction 14-242 
SUBC instruction 14-244 
SUBF instruction 14-246 
SUBF3 instruction 14-248 
SUBF3||STF instruction 14-250 
SUBI instruction 14-252 
SUBI3 instruction 14-254 
SUBI3||STI instruction 14-256 
SUBRB instruction 14-258 
SUBRF instruction 14-260 
SUBRI instruction 14-262 
subtraction 
floating-point 5-23, 5-25 
SWI instruction 14-264 
symbols 14-16 
symbols (used in manual) iv 


sync mode 
transfer rate 11-55 


SYNC MODE and AUTOINIT MODE bits 


autoinitialization 
table 11-38 
SYNC MODE bits 11-46 
SYNC MODE field descriptions 
split mode 
table 11-13 
unified mode 
table 11-13 
synchronization 11-37 
destination 11-48 
DMA channels 11-43 
source 11-48 
source and destination 11-49 
synchronization interrupts 
DMA channels 3-9 
synchronizer delays 12-28 
synchronizers 12-26 


task counter example 9-42 
TCLK 13-4 
TCLK as an input 
figure 13-15 
TCLK as an output 
figure 13-15 
technical assistance xvi 
telecommunications applications viii, xiii 
three-operand addressing modes 2-18, 6-22 
encoding for type 1 6-24 
three-operand instruction word 8-20 
three-operand instructions 
table 14-6 
timer 
definition A-7 
interrupts 
considerations 13-12 
timer clock setup 
maximum setup 13-16 
timer configuration 
CLKSRC = 0 and FUNC = 0 13-14 
CLKSRC = 0 and FUNC = 1 13-14 
CLKSRC = 1 and FUNC = 0 13-13 
CLKSRC = 1 and FUNC = 1 13-13 
timer control register bit summary 
C/P 13-7 
CLKSRC 13-7 
DATIN 13-6 



DATOUT 13-6 
FUNC 13-6 
GO 13-6 
HLD 13-6 
I/O 13-6 
INV 13-7 
TSTAT 13-7 


timer global control register 
diagram 
bit summary 13-6 
timer output generation 
examples 13-10 
timer pins 13-4 
timer pulse mode 
clock mode timing 13-9 
timer registers 4-7, 7-35 
figure 4-7 
timer-period register 
definition A-7 
timers 2-24, 13-2, 13-2 to 13-3 
boundary conditions 13-8 
control registers 13-5 
counter register 13-2, 13-8 
global control register 13-6 
I/O pin 2-24 
initialization 13-16 
interrupts 13-11 
operation 13-11 
introduction 13-1 
operation modes 13-10 
period register 13-2, 13-7 
pulse generation 13-9 
selecting CLKSRC 13-13 
selecting FUNC 13-13 
TCLK 
general-purpose I/O 13-15 
timing 
DMA channels 11-51 
IACK pin 9-49 
memory access 9-16 
STRB, RDY 9-16 


TINT0 flag 3-15 
TINT1 flag 3-15 


TMS320C40 
introduction 1-2 


TMS320C44 
introduction 1-2 


TMS320C4x 
features comparison 1-4 
introduction 1-1 
key features 1-3 
TMS320C4x devices 1-2 
TMS320LC40 
introduction 1-2 
TOIEEE instruction 14-265 
TOIEEE||STF instruction 14-266 
token 12-3 
token transfer 
figure 12-20 
operation 12-5, 12-19 
token transfer sequence 
table 12-21 
transfer counter registers 
figure 11-17 
TRANSFER MODE = 00₂ 
running 11-28 
TRANSFER MODE = 01₂ 
running 11-29 
TRANSFER MODE = 10₂ 
running 11-29 
TRANSFER MODE = 11₂ 
running 11-31 
TRANSFER MODE field descriptions 11-28 
table 11-12 
transfer rate 
sync mode 11-55 
trap flow 
figure 7-24 
TRAP instruction 7-25 
trap vector table (TVT) 
boot loader use 10-9 to 10-10 
definition A-7 
trap vector table pointer (TVTP) 
definition A-7 
TRAPcond instruction 7-12, 14-267 
traps 7-12, 7-24 
initialization 7-24 
operation 7-24 
overlapping the TVT and IVT 7-25 
vector table 7-25 
TSTB instruction 14-268 
TSTB3 instruction 14-270 
TVTP 3-17 
See also trap vector table (TVT) 


TVTP register 2-10 
two parallel stores 
figure 8-22 
two-operand instruction word 8-20 


two-operand instructions 
table 14-4 


type one synchronizer 
maximum delay 12-26 
minimum delay 12-26 


type three synchronizer 
maximum delay 12-28 
minimum delay 12-27 


type two synchronizer 
maximum delay 12-27 
minimum delay 12-27 
TYPEx bit 3-14 


UF flag 3-5 
underflow 5-23 
unified mode 
definition A-8 
unified mode (DMA) 11-19, 11-35 
unsigned-integer formats 5-3 


V flag 3-5 

value in a floating-point number 
equation 5-4 

vector locations 
table 7-35 

vectors (reset, interrupts) 7-28 


wait state 
definition A-8 


wait states 9-14, 9-36, 9-37 
bus disabled 9-38 


wait-state generator 
definition A-8 


word transfer 
operation 12-22, 12-23 


word transfer sequence 
table 12-24 


word transfers 11-5 
write to AR 8-8 


XOR instruction 14-272 
XOR3 instruction 14-274 
XOR3||STI instruction 14-276 


Z flag 3-5 
zero fill 
definition A-8 
Reader Response Card: TMS320C4x User’s Guide 

Please respond to a few questions to help us provide you 
with the best documentation possible. 

What is your primary use for the information in this manual? 

[_] Designing ’C4x-based hardware 
[_] Designing ’C4x-based software 

How have you used this manual? 

[_] To look up specific information or procedures when 
needed (as a reference) 
[_] To read chapters about subjects of interest 
[_] To read from front to back before using the information 

Which topics were difficult to find 
and why (for example, the topic 
was not in a logical location)? 

What other suggestions do you 
have for improving this book? 

Have you found any mistakes or unclear 
information in this manual 
(please describe and include page 
numbers)? 

Which topics should be described in 